INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 
International Bureau 

(43) International Publication Date 
22 March 2001 (22.03.2001) 




PCT 



(10) International Publication Number 

WO 01/19842 A1



(51) International Patent Classification 7: C07H 21/02,
21/04, C12N 5/00, 5/02, 15/00, 15/09, 15/63, 15/70,
C07K 1/00, 14/00, 16/00, 17/00, C08H 1/00, C12P
21/00, A01K 67/00, 67/033

(21) International Application Number: PCT/US00/25558

(22) International Filing Date: 18 September 2000 (18.09.2000)

(25) Filing Language: English

(26) Publication Language: English


(30) Priority Data: 

09/399,079 17 September 1999 (17.09.1999) US 

(63) Related by continuation (CON) or continuation-in-part
(CIP) to earlier application:
US 09/399,079 (CIP)
Filed on 17 September 1999 (17.09.1999)

(71) Applicant (for all designated States except US): GENZYME
TRANSGENICS CORPORATION [US/US];
175 Crossing Boulevard, Framingham, MA 01701-9322
(US).

(72) Inventors; and
(75) Inventors/Applicants (for US only): POLLOCK, Dan
[US/US]; 4 Redgate Drive, Medway, MA 02053 (US).
MEADE, Harry, M. [US/US]; 62 Grasmere Street,
Newton, MA 02458 (US). BOSSLET, Klaus [DE/DE];
Schering AG, 13342 Berlin (DE).

(74) Agent: MYERS, Louis; Fish & Richardson P.C., 225
Franklin Street, Boston, MA 02110-2804 (US).

(81) Designated States (national): AE, AG, AL, AM, AT, AU,
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CR, CU, CZ,
DE, DK, DM, DZ, EE, ES, FI, GB, GD, GE, GH, GM, HR,
HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR,
LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ,
NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM,
TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW), Eurasian
patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European
patent (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE,
IT, LU, MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG,
CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG).

Published: 

— With international search report 

For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.



(54) Title: SUBUNIT OPTIMIZED FUSION PROTEINS

(57) Abstract: A method of making a fusion protein having: a first member, fused to a second member wherein the first and second
members are chosen such that the fusion protein assembles into a complex having a number of subunits which optimizes activity of
the multimeric form of the second member.



SUBUNIT OPTIMIZED FUSION PROTEINS

Related Applications

This application claims the benefit of a previously filed Provisional Application No.
60/101,083 filed September 18, 1998, which is hereby incorporated by reference.

Field of the Invention

The invention relates to a fusion protein having a first and a second member,
wherein the second member of the fusion protein assembles into a multimer and the other
member is chosen, or modified, such that it promotes assembly of the second member into a
preselected or an optimal number of subunits.

Background of the Invention 

Fusion proteins can combine useful properties of distinct proteins. E.g., a fusion
protein can combine the targeting property of an antibody molecule with the cytotoxic effect
of a toxin.

Summary of the Invention 

In general, the invention features a method of making a fusion protein having: a
first member, e.g., a targeting moiety, e.g., an immunoglobulin subunit (e.g., an
immunoglobulin heavy chain or light chain, or a fragment of either) fused to a second
member, e.g., an enzyme, e.g., a toxin (e.g., an enzyme or toxin subunit). The first and
second members are chosen such that the fusion protein assembles into a complex having a
number of subunits which optimizes activity of the multimeric form of the second member.
In preferred embodiments the first member, or the fusion protein, assembles into a form
having the same number of subunits as are present in an active, e.g., native, form of the
second member. In preferred embodiments the first member, or the fusion protein,
assembles into a form having fewer subunits than are present in an active, e.g., native, form
of the second member.

In preferred embodiments, the fusion protein assembles into a complex, e.g., a di-,
tri-, tetra-, or higher multi-meric complex. Preferably, the fusion protein assembles into a
dimer or a tetramer.

In preferred embodiments, the fusion protein assembles into a complex having
enzymatic activity.

In a preferred embodiment, the first member is a monomer. E.g., it is a species
which is normally monomeric, or which has been modified, e.g., by mutation of a site which
modulates formation or maintenance of a multimer of subunits. In some embodiments the
monomeric form is useful because it does not prevent formation of a multimer by the
second member.

In another preferred embodiment, the first member forms a dimer, e.g., a
heterodimer or homodimer. E.g., it is a species which is normally dimeric, or which has
been modified, e.g., by mutation of a site which modulates formation or maintenance of a
multimer of subunits, to be dimeric. In some embodiments the dimeric form is useful
because it does not prevent formation of a multimer by the second member.

In preferred embodiments, the fusion protein has the formula: R1-L-R2; R2-L-R1;
R2-R1; or R1-R2, wherein R1 is a first member, e.g., an immunoglobulin subunit, L is a
peptide linker and R2 is a second member, e.g., an enzyme subunit. Preferably, R1 and R2
are covalently linked, e.g., directly fused or linked via a peptide linker.
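
The four formulas above can be enumerated mechanically. The following is a minimal sketch, not part of the application: the function name `fusion_arrangements` and the string labels are illustrative assumptions, and each layout is simply read left to right in the order written.

```python
from itertools import permutations


def fusion_arrangements(r1: str, r2: str, linker: str = "L") -> list:
    """Enumerate the layouts R1-L-R2, R1-R2, R2-L-R1 and R2-R1 (labels only)."""
    layouts = []
    for first, second in permutations((r1, r2)):
        layouts.append(f"{first}-{linker}-{second}")  # joined via a peptide linker
        layouts.append(f"{first}-{second}")           # directly fused
    return layouts


print(fusion_arrangements("R1", "R2"))
```

Each member can come first, and each ordering can be written with or without the linker L, which gives the four formulas.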

In preferred embodiments, the first or the second member of the fusion protein, or
both, are modified by, e.g., substituting or deleting a portion of the amino acid sequence.
In a particularly preferred embodiment the fusion protein includes a first member which is
an Ig superfamily member, preferably an Ig subunit, which has been modified to inhibit
formation of a multimeric form, e.g., a tetrameric form. Preferably the modification, which
can be a change, insertion, or deletion of one or more amino acid residues, results in a
subunit which does not form a multimer or which forms a lower order multimer than it
normally would form, e.g., it forms a dimer rather than a tetramer.
Preferably, a region which mediates formation or maintenance of a multimeric structure is
modified and thereby wholly or partly inactivated. E.g., a portion of an immunoglobulin
subunit, e.g., a heavy chain, e.g., the hinge region, is modified, e.g., deleted. In those
embodiments where the hinge region of the immunoglobulin is modified, e.g., removed, the
modified immunoglobulin is monovalent.

In preferred embodiments, the modification of the first member inhibits the
assembly of the first member, or the fusion protein, into a multimer, e.g., results in the
production of a monomer, or, e.g., of a dimer, where a higher order multimer would
otherwise be formed.

In preferred embodiments, the first member is a targeting agent, e.g., a polypeptide
having a high affinity for a target, e.g., an antibody, a ligand, or an enzyme.

In preferred embodiments, the first member is an immunoglobulin or a fragment
thereof, e.g., an antigen binding fragment thereof. Preferably, the immunoglobulin is a
monoclonal antibody, e.g., a human or murine (e.g., mouse) monoclonal antibody; or a
recombinant monoclonal antibody. Preferably, the monoclonal antibody is a human
antibody. In other embodiments, the monoclonal antibody is a recombinant antibody, e.g., a
chimeric or a humanized antibody (e.g., it has a variable region, or at least a
complementarity determining region (CDR), derived from a non-human antibody (e.g.,
murine) with the remaining portion(s) being human in origin); or a transgenically produced
human antibody (e.g., an antibody produced by a hybridoma which includes a B cell
obtained from a transgenic non-human animal, e.g., a transgenic mouse, having a genome
comprising a human heavy chain transgene and a light chain transgene fused to an
immortalized cell).

In preferred embodiments, the first member is a full-length antibody (e.g., an IgG1
or IgG4 antibody) or includes only an antigen-binding portion (e.g., a Fab, F(ab')2, Fv or a
single chain Fv fragment).

In preferred embodiments, the first member is an immunoglobulin subunit selected
from the group consisting of a subunit of: IgG (e.g., IgG1, IgG2, IgG3, IgG4), IgM, IgA1,
IgA2, IgAsec, IgD, or IgE. Preferably, the immunoglobulin subunit is an IgG isotype,
e.g., IgG3.

In preferred embodiments, the first member is a monomer, e.g., a single chain
antibody; or forms a dimer, e.g., a dimer of an immunoglobulin heavy chain and a light
chain.

In preferred embodiments, the first member is a monovalent antibody (e.g., it
includes one pair of heavy and light chains, or antigen binding portions thereof). In other
embodiments, the first member is a divalent antibody (e.g., it includes two pairs of heavy and
light chains, or antigen binding portions thereof).

In preferred embodiments, the first member includes an immunoglobulin heavy
chain or a fragment thereof, e.g., an antigen binding fragment thereof. Preferably, the
immunoglobulin heavy chain or fragment thereof (e.g., an antigen binding fragment thereof)
is linked, e.g., linked via a peptide linker or is directly fused, to an enzyme. Preferably, the
immunoglobulin heavy chain-enzyme fusion protein is capable of assembling into a
functional complex, e.g., a di-, tri-, tetra-, or multi-meric complex having enzymatic
activity. The most preferred form is dimeric.

In preferred embodiments, the first member includes an immunoglobulin heavy
chain or fragment thereof (e.g., an antigen binding fragment thereof), and a light chain or a
fragment thereof (e.g., an antigen binding fragment thereof). Preferably, the
immunoglobulin heavy chain is linked, e.g., linked via a peptide linker or directly fused, to
an enzyme. Preferably, the immunoglobulin heavy chain-enzyme fusion protein
assembles with a light chain, e.g., to produce a functional complex, e.g., a di-, tri-, tetra-, or
multi-meric complex having enzymatic activity. The most preferred form is dimeric.

In preferred embodiments, the first member is an immunoglobulin that interacts with
(e.g., binds to) a cell surface antigen on a target cell, e.g., a cancer cell. For example, the
immunoglobulin binds to a tumor cell antigen, e.g., carcinoembryonic antigen (CEA), TAG-72,
her-2/neu, epidermal growth factor receptor, transferrin receptor, among others.

In preferred embodiments, the first member localizes, e.g., increases the
concentration of, a fusion protein in proximity to a target cell, e.g., a cancer cell.

In preferred embodiments, the second member is a subunit of an enzyme, e.g., an
enzyme having one or more subunits (e.g., catalytic subunits). Preferably, the enzyme
includes one, preferably two, more preferably three, most preferably four subunits. A
preferred enzyme is beta-glucuronidase, e.g., a human beta-glucuronidase. The enzyme can
be a homo-, or a hetero-multimer. If the enzyme is a heteromultimer, two (or more) fusion
proteins are needed to form the active product.

In preferred embodiments, the second member is capable of converting a precursor
drug, e.g., a prodrug, to a toxic drug.

In preferred embodiments, the first member includes immunoglobulin G (IgG) heavy
and light chains, and the second member is human beta-glucuronidase.

In preferred embodiments, the light chain of the first member has an amino acid
sequence as shown in Figure 1B (SEQ ID NO:2); the light chain of the first member has an
amino acid sequence with at least 60%, 70%, 75%, more preferably at least 85%, more
preferably at least 90%, more preferably at least 95%, most preferably at least 98%, 99%
sequence identity or homology with an amino acid sequence from Figure 1B (SEQ ID
NO:2).
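
The text does not say how percent identity is to be computed, so the following is only a sketch under stated assumptions: the two sequences are already aligned to equal length, gaps are written as `-`, and identity is the share of compared columns with identical residues. The names `percent_identity`, `thresholds_met` and `THRESHOLDS` are hypothetical; the threshold values are the percentages recited above.

```python
# Identity levels recited in the text, in ascending order.
THRESHOLDS = (60, 70, 75, 85, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical residues (assumed definition)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences is skipped
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def thresholds_met(identity: float) -> list:
    """Return every recited threshold that the identity value reaches."""
    return [t for t in THRESHOLDS if identity >= t]


# Toy 10-residue peptides (not sequences from the figures), differing at one position.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(identity, thresholds_met(identity))
```

A real comparison would first align the candidate against the reference sequence with an alignment tool; that step is outside this sketch.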

In preferred embodiments, the light chain of the first member has an amino acid
sequence that is encoded by a nucleotide sequence as shown in Figure 1B (SEQ ID NO:1),
or Figure 2 (SEQ ID NO:37); the light chain of the first member has an amino acid
sequence that is encoded by a nucleotide sequence with at least 60%, 70%, 75%, more preferably
at least 85%, more preferably at least 90%, more preferably at least 95%, most preferably at
least 98%, 99% sequence identity or homology with a nucleotide sequence shown in Figure
1B (SEQ ID NOs:2, 3, or 4), or Figure 2 (SEQ ID NO:37); the light chain of the first
member has an amino acid sequence that is encoded by a nucleotide sequence that is
capable of hybridizing under stringent conditions to the nucleotide sequence shown in
Figure 1B.

In preferred embodiments, the heavy chain of the first member has an amino acid
sequence as shown in Figure 4B (SEQ ID NO:6, 7, 8, 9, 10 and/or 11), or Figure 5 (SEQ ID
NOs:13, 14, 15 and/or 16); the heavy chain of the first member has an amino acid sequence
with at least 60%, 70%, 75%, more preferably at least 85%, more preferably at least 90%, more
preferably at least 95%, most preferably at least 98%, 99% sequence identity or homology
with an amino acid sequence from Figure 4B (SEQ ID NO:6, 7, 8, 9, 10 and/or 11), or
Figure 5 (SEQ ID NOs:13, 14, 15 and/or 16).

In preferred embodiments, the heavy chain of the first member has an amino acid
sequence that is encoded by a nucleotide sequence as shown in Figure 4B (SEQ ID NO:5),
or Figure 5 (SEQ ID NO:12); the heavy chain of the first member has an amino acid
sequence that is encoded by a nucleotide sequence with at least 60%, 70%, 75%, more preferably
at least 85%, more preferably at least 90%, more preferably at least 95%, most preferably at
least 98%, 99% sequence identity or homology with a nucleotide sequence shown in Figure
4B (SEQ ID NO:5), or Figure 5 (SEQ ID NO:12); the heavy chain of the first member has
an amino acid sequence that is encoded by a nucleotide sequence that is capable of
hybridizing under stringent conditions to the nucleotide sequence shown in Figure 4B, or 5.

In a preferred embodiment, the fusion protein includes a peptide linker and the
peptide linker has one or more of the following characteristics: a) it allows for the rotation
of the first and the second member relative to each other; b) it is resistant to digestion by
proteases; c) it does not interact with the first or the second member; d) it allows the fusion protein to
form a complex (e.g., a di-, tri-, tetra-, or multi-meric complex) that retains enzymatic
activity; and e) it promotes folding and/or assembly of the fusion protein into an active
complex.

In a preferred embodiment: the fusion protein includes a peptide linker and the
peptide linker is 5 to 60, more preferably, 10 to 30, amino acids in length; the peptide linker
is 20 amino acids in length; the peptide linker is 17 amino acids in length; each of the amino
acids in the peptide linker is selected from the group consisting of Gly, Ser, Asn, Thr and
Ala; the peptide linker includes a Gly-Ser element.

In a preferred embodiment, the fusion protein includes a peptide linker and the
peptide linker includes a sequence having the formula (Ser-Gly-Gly-Gly-Gly)y wherein y is
1, 2, 3, 4, 5, 6, 7, or 8. Preferably, the peptide linker includes a sequence having the
formula (Ser-Gly-Gly-Gly-Gly)3. Preferably, the peptide linker includes a sequence having
the formula ((Ser-Gly-Gly-Gly-Gly)3-Ser-Pro).
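
The linker formula can be expanded into a concrete residue string to check its length and composition. This is a minimal sketch: the helper `sg4_linker`, the one-letter mapping table and the constant `PREFERRED_RESIDUES` are illustrative assumptions, while the repeat unit, the range of y and the Ser-Pro tail come from the text.

```python
# Three-letter to one-letter codes for the residues named in the linker passages.
THREE_TO_ONE = {"Ser": "S", "Gly": "G", "Asn": "N", "Thr": "T", "Ala": "A", "Pro": "P"}

# One-letter codes for the group Gly, Ser, Asn, Thr and Ala.
PREFERRED_RESIDUES = set("GSNTA")


def sg4_linker(y: int, tail: tuple = ()) -> str:
    """Expand (Ser-Gly-Gly-Gly-Gly)y, plus an optional tail, to one-letter code."""
    if not 1 <= y <= 8:
        raise ValueError("y must be an integer from 1 to 8")
    unit = ("Ser", "Gly", "Gly", "Gly", "Gly")
    residues = unit * y + tuple(tail)
    return "".join(THREE_TO_ONE[r] for r in residues)


core = sg4_linker(3)                      # (Ser-Gly-Gly-Gly-Gly)3
extended = sg4_linker(3, ("Ser", "Pro"))  # ((Ser-Gly-Gly-Gly-Gly)3-Ser-Pro)

print(core, len(core))
print(extended, len(extended))
print(set(core) <= PREFERRED_RESIDUES, set(extended) <= PREFERRED_RESIDUES)
```

With y = 3 the repeat alone is 15 residues, and the Ser-Pro tail brings it to 17, one of the lengths recited above. The tail's proline is not in the Gly/Ser/Asn/Thr/Ala group, so the composition feature and the Ser-Pro formula read as alternative preferred features.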

In preferred embodiments, the fusion protein is produced recombinantly, e.g.,
produced in a host cell (e.g., a cultured cell), or in a transgenic animal, e.g., a transgenic
mammal (e.g., a goat, a cow, or a rodent (e.g., a mouse)).

In preferred embodiments, the fusion protein is produced in a transgenic mammal
(e.g., a goat, a cow, or a rodent (e.g., a mouse)). Thus, the method further includes:
providing a transgenic animal, which includes a transgene which provides for the
expression of a fusion protein described herein; allowing the transgene to be
expressed; and, preferably, recovering fusion protein from the milk of the
transgenic mammal.

For embodiments where the fusion protein is produced transgenically, the fusion
protein can further include:

a signal sequence which directs the secretion of the fusion protein, e.g., a signal
from a secreted protein (e.g., a signal from a protein secreted into milk; or an
immunoglobulin secretory signal); and

(optionally) a sequence which encodes a sufficient portion of the amino terminal
coding region of a secreted protein, e.g., a protein secreted into milk, or an immunoglobulin,
to promote secretion, e.g., in the milk of a transgenic mammal, of the fusion protein.

In preferred embodiments, the fusion protein is made in a mammary gland of the
transgenic mammal, e.g., a ruminant, e.g., a goat or a cow.

In preferred embodiments, the fusion protein is secreted into the milk of the
transgenic mammal, e.g., a ruminant, e.g., a dairy animal, e.g., a goat or a cow.

In preferred embodiments, the fusion protein is secreted into the milk of a transgenic
mammal at concentrations of at least about 0.1 mg/ml, 0.5 mg/ml, 1.0 mg/ml, 1.5 mg/ml, 2
mg/ml, 3 mg/ml, 5 mg/ml or higher.

In preferred embodiments, the fusion protein is made under the control of a
mammary gland specific promoter, e.g., a milk specific promoter, e.g., a milk serum protein
or casein promoter. The milk specific promoter can be a casein promoter, beta
lactoglobulin promoter, whey acid protein promoter, or lactalbumin promoter. Preferably,
the promoter is a goat β casein promoter.

In preferred embodiments, the transgene encoding the fusion protein is a nucleic 
acid construct which includes: 

(a) optionally, an insulator sequence;

(b) a promoter, e.g., a mammary epithelial specific promoter, e.g., a milk protein
promoter,

(c) a nucleotide sequence which encodes a signal sequence which can direct the
secretion of the fusion protein, e.g., a signal from a milk specific protein, or an
immunoglobulin;

(d) optionally, a nucleotide sequence which encodes a sufficient portion of the
amino terminal coding region of a secreted protein, e.g., a protein secreted into milk, or an
immunoglobulin, to allow secretion, e.g., in the milk of a transgenic mammal, of the fusion
protein;

(e) one or more nucleotide sequences which encode a fusion protein, e.g., an
immunoglobulin-enzyme fusion protein as described herein; and

(f) (optionally) a 3' untranslated region from a mammalian gene, e.g., a mammary
epithelial specific gene, (e.g., a milk protein gene).
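
The element list (a) to (f) can be captured as a small record that separates required from optional parts. This is a sketch for illustration only: the class `TransgeneConstruct`, its field names and the example strings are hypothetical, and the output follows the listing order of the text, not a verified physical order on the construct.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TransgeneConstruct:
    """Hypothetical container for elements (a)-(f); names are illustrative."""
    promoter: str                                      # (b) required
    signal_sequence: str                               # (c) required
    fusion_coding_sequences: List[str]                 # (e) one or more
    insulator: Optional[str] = None                    # (a) optional
    secreted_protein_n_terminus: Optional[str] = None  # (d) optional
    three_prime_utr: Optional[str] = None              # (f) optional

    def __post_init__(self) -> None:
        if not self.fusion_coding_sequences:
            raise ValueError("element (e) needs at least one coding sequence")

    def listed_elements(self) -> List[Tuple[str, str]]:
        """Return the elements that are present, in the (a)-(f) listing order."""
        slots = [
            ("a", self.insulator),
            ("b", self.promoter),
            ("c", self.signal_sequence),
            ("d", self.secreted_protein_n_terminus),
        ]
        slots += [("e", seq) for seq in self.fusion_coding_sequences]
        slots.append(("f", self.three_prime_utr))
        return [(label, value) for label, value in slots if value is not None]


# Example values are placeholders, not sequences from the application.
construct = TransgeneConstruct(
    promoter="goat beta casein promoter",
    signal_sequence="goat beta casein signal sequence",
    fusion_coding_sequences=["heavy chain-enzyme fusion", "light chain"],
    three_prime_utr="goat beta casein 3' untranslated region",
)
print([label for label, _ in construct.listed_elements()])
```

Modeling (a), (d) and (f) as optional fields mirrors the "optionally" wording, and the check on (e) reflects the "one or more" requirement.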

In preferred embodiments, elements a (if present), b, c, d (if present), and f of the
transgene are from the same gene; the elements a (if present), b, c, d (if present), and f of the
transgene are from two or more genes. For example, the signal sequence, the promoter
sequence and the 3' untranslated sequence can be from a mammary epithelial specific gene,
e.g., a milk serum protein or casein gene (e.g., a β casein gene). Preferably, the signal
sequence, the promoter sequence and the 3' untranslated sequence are from a goat β casein
gene.

In preferred embodiments, the promoter of the transgene is a mammary epithelial
specific promoter, e.g., a milk serum protein or casein promoter (e.g., a β casein promoter).
The milk specific promoter can be a casein promoter, beta lactoglobulin promoter, whey
acid protein promoter, or lactalbumin promoter. Preferably, the promoter is a goat β casein
promoter.

In preferred embodiments, the signal sequence encoded by the transgene is an amino
terminal sequence which directs the expression of the protein to the exterior of a cell, or into
the cell membrane. For example, the signal sequence can be obtained from an
immunoglobulin protein. Preferably, the signal sequence is from a protein which is secreted
into the milk, e.g., the milk of the transgenic animal.

In preferred embodiments, the one or more nucleotide sequences encoding a fusion
protein include one or more of: a nucleotide sequence encoding a first member, e.g., an
immunoglobulin heavy chain (or an antigen binding portion thereof) operably linked to a
second member, e.g., an enzyme; (optionally) a nucleotide sequence encoding an
immunoglobulin light chain (or an antigen binding portion thereof), or both. In one
embodiment, the nucleotide sequences encoding the heavy chain fusion and the light chain
are operatively linked in a single construct, e.g., a single cosmid. In another embodiment,
the nucleotide sequences encoding the heavy chain fusion and the light chain are introduced
into a transgenic animal in separate constructs. Preferably, when linked, the nucleotide
sequences are arranged in the following order:

5'-N1-3' linked to 5'-N2-3'; or 5'-N2-3' linked to 5'-N1-3', wherein N1 is a first member,
e.g., an immunoglobulin heavy chain (or an antigen binding portion thereof) operably linked
to a second member, e.g., an enzyme; and N2 is an immunoglobulin light chain (or an
antigen binding portion thereof). The nucleotide sequences can be in any orientation with
respect to each other, e.g., sense/sense; reverse/reverse; sense/reverse; or reverse/sense.
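
The two linkage orders and four orientation pairs above combine into a small set of layouts for a single construct. The sketch below enumerates them under one assumption that the text does not settle: the orientation pair is read as (first listed sequence)/(second listed sequence). The names `ORDERS`, `ORIENTATIONS` and `linked_arrangements` are hypothetical.

```python
from itertools import product

ORDERS = (("N1", "N2"), ("N2", "N1"))  # N1 = heavy chain fusion, N2 = light chain
ORIENTATIONS = ("sense", "reverse")


def linked_arrangements() -> list:
    """List every order/orientation combination for N1 and N2 on one construct."""
    arrangements = []
    for first, second in ORDERS:
        for o_first, o_second in product(ORIENTATIONS, repeat=2):
            arrangements.append(f"{first}({o_first}) + {second}({o_second})")
    return arrangements


for layout in linked_arrangements():
    print(layout)
```

Under that reading there are 2 orders times 4 orientation pairs, or 8 layouts in total.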

In preferred embodiments, the 3' untranslated region of the transgene includes a
polyadenylation site, and is obtained from a mammalian gene, e.g., a mammary epithelial
specific gene, e.g., a milk serum protein gene or casein gene. The 3' untranslated region
can be obtained from a casein gene (e.g., a β casein gene), a beta lactoglobulin gene, whey
acid protein gene, or lactalbumin gene. Preferably, the 3' untranslated region is from a goat
β casein gene.

In preferred embodiments, the transgene, e.g., the transgene as described herein,
integrates into a germ cell and/or a somatic cell of the transgenic animal.

In another aspect, the invention features a method for providing a transgenically
produced fusion protein, e.g., a fusion protein as described herein, in the milk of a
transgenic mammal. The method includes obtaining milk from a transgenic mammal
which includes a fusion protein encoding transgene, e.g., one which has been introduced
into its germline, e.g., a nucleic acid construct as described herein, that results in the
expression of the protein-coding sequence of the fusion protein in mammary gland epithelial
cells, thereby secreting the fusion protein in the milk of the mammal.

In preferred embodiments the transgenic mammal is selected from the group
consisting of sheep, mice, pigs, cows and goats. The preferred transgenic mammal is a
goat.

In preferred embodiments, the fusion protein is secreted into the milk of a transgenic 
mammal at concentrations of at least about 0.1 mg/ml, 0.5 mg/ml, 1.0 mg/ml, 1.5 mg/ml, 2 
mg/ml, 3 mg/ml, 5 mg/ml or higher. 

In preferred embodiments, the transgene encoding the immunoglobulin-enzyme
fusion protein is a nucleic acid construct which includes:

(a) optionally, an insulator sequence;

(b) a promoter, e.g., a mammary epithelial specific promoter, e.g., a milk protein
promoter,

(c) a nucleotide sequence which encodes a signal sequence which can direct the
secretion of the fusion protein, e.g., a signal from a milk specific protein, or an
immunoglobulin;

(d) optionally, a nucleotide sequence which encodes a sufficient portion of the
amino terminal coding region of a secreted protein, e.g., a protein secreted into milk, or an
immunoglobulin, to allow secretion, e.g., in the milk of a transgenic mammal, of the
non-secreted protein;

(e) one or more nucleotide sequences which encode a fusion protein, e.g., a fusion
protein as described herein; and

(f) optionally, a 3' untranslated region from a mammalian gene, e.g., a mammary
epithelial specific gene, (e.g., a milk protein gene).

In preferred embodiments, elements a (if present), b, c, d (if present), and f of the
transgene are from the same gene; the elements a (if present), b, c, d (if present), and f of the
transgene are from two or more genes. For example, the signal sequence, the promoter
sequence and the 3' untranslated sequence can be from a mammalian gene, e.g., a mammary
epithelial specific gene, e.g., a milk serum protein or casein gene (e.g., a β casein gene).
Preferably, the signal sequence, the promoter sequence and the 3' untranslated sequence are
from a goat β casein gene.

In preferred embodiments, the promoter of the transgene is a mammary epithelial
specific promoter, e.g., a milk serum protein or casein promoter (e.g., a β casein promoter).
The milk specific promoter can be a casein promoter, beta lactoglobulin promoter, whey
acid protein promoter, or lactalbumin promoter. Preferably, the promoter is a goat β casein
promoter.

In preferred embodiments, the signal sequence encoded by the transgene is an amino
terminal sequence which directs the expression of the protein to the exterior of a cell, or into
the cell membrane. Preferably, the signal sequence is from a protein which is secreted into
the milk, e.g., the milk of the transgenic animal.

In preferred embodiments, the one or more nucleotide sequences encoding a fusion
protein include one or more of: a nucleotide sequence encoding an immunoglobulin heavy
chain (or an antigen binding portion thereof) fused to an enzyme; a nucleotide sequence
encoding an immunoglobulin light chain (or an antigen binding portion thereof), or both. In
one embodiment, the nucleotide sequences encoding the heavy chain fusion and the light
chain are operatively linked in a single construct, e.g., a single cosmid. In another
embodiment, the nucleotide sequences encoding the heavy chain fusion and the light chain
are introduced into a transgenic animal in separate constructs. Preferably, when linked, the
nucleotide sequences are arranged in the following order:

5'-N1-3' linked to 5'-N2-3'; or 5'-N2-3' linked to 5'-N1-3', wherein N1 is an
immunoglobulin heavy chain (or an antigen binding portion thereof) linked to an enzyme;
and N2 is an immunoglobulin light chain (or an antigen binding portion thereof). The
nucleotide sequences can be in any orientation with respect to each other, e.g., sense/sense;
reverse/reverse; sense/reverse; or reverse/sense.

In preferred embodiments, the 3' untranslated region of the transgene includes a
polyadenylation site, and is obtained from a mammalian gene, e.g., a mammary epithelial
specific gene, (e.g., a milk serum protein gene or casein gene). The 3' untranslated region
can be obtained from a casein gene (e.g., a β casein gene), a beta lactoglobulin gene, whey
acid protein gene, or lactalbumin gene. Preferably, the 3' untranslated region is from a goat
β casein gene.

In preferred embodiments, the transgene, e.g., the transgene as described herein,
integrates into a germ cell and/or a somatic cell of the transgenic animal.

In another aspect, the invention features a transgene, e.g., a nucleic acid construct,
preferably, an isolated nucleic acid construct, which includes:

(a) optionally, an insulator sequence;

(b) a promoter, e.g., a mammary epithelial specific promoter, e.g., a milk protein
promoter,

(c) a nucleotide sequence which encodes a signal sequence which can direct the
secretion of the fusion protein, e.g., a signal sequence from a milk specific protein, or an
immunoglobulin;

(d) optionally, a nucleotide sequence which encodes a sufficient portion of the
amino terminal coding region of a secreted protein, e.g., a protein secreted into milk, or an
immunoglobulin, to allow secretion, e.g., in the milk of a transgenic mammal, of the fusion
protein;

(e) one or more nucleotide sequences which encode a fusion protein, e.g., a fusion
protein as described herein; and

(f) optionally, a 3' untranslated region from a mammalian gene, e.g., a mammary
epithelial specific gene, (e.g., a milk protein gene).

In preferred embodiments, elements a (if present), b, c, d (if present), and f of the
transgene are from the same gene; the elements a (if present), b, c, d (if present), and f of the
transgene are from two or more genes. For example, the signal sequence, the promoter
sequence and the 3' untranslated sequence can be from a mammalian gene, e.g., a mammary
epithelial specific gene, e.g., a milk serum protein or casein gene (e.g., a β casein gene).
Preferably, the signal sequence, the promoter sequence and the 3' untranslated sequence are
from a goat β casein gene.

In preferred embodiments, the promoter of the transgene is a mammary epithelial
specific promoter, e.g., a milk serum protein or casein promoter (e.g., a β casein promoter).
The milk specific promoter can be a casein promoter, beta lactoglobulin promoter, whey
acid protein promoter, or lactalbumin promoter. Preferably, the promoter is a goat β casein
promoter.

In preferred embodiments, the signal sequence encoded by the transgene is an amino
terminal sequence which directs the expression of the protein to the exterior of a cell, or into
the cell membrane. Preferably, the signal sequence is from a milk specific protein, or an
immunoglobulin. Preferably, the signal sequence directs secretion of the encoded fusion
protein into the milk of a transgenic animal, e.g., a transgenic mammal.

In preferred embodiments, the one or more nucleotide sequences encoding a fusion
protein include one or more of: a nucleotide sequence encoding an immunoglobulin heavy
chain (or an antigen binding portion thereof) fused to an enzyme; a nucleotide sequence
encoding an immunoglobulin light chain (or an antigen binding portion thereof), or both. In
one embodiment, the nucleotide sequences encoding the heavy chain fusion and the light
chain are operatively linked in a single construct, e.g., a single cosmid. In another
embodiment, the nucleotide sequences encoding the heavy chain fusion and the light chain
are introduced into a transgenic animal in separate constructs. Preferably, when linked, the
nucleotide sequences are arranged in the following order:

5'-N1-3' linked to 5'-N2-3'; or 5'-N2-3' linked to 5'-N1-3', wherein N1 is an
immunoglobulin heavy chain (or an antigen binding portion thereof) linked to an enzyme;
and N2 is an immunoglobulin light chain (or an antigen binding portion thereof). The
nucleotide sequences can be in any orientation with respect to each other, e.g., sense/sense;
reverse/reverse; sense/reverse; or reverse/sense.

In preferred embodiments, the 3' untranslated region of the transgene includes a
polyadenylation site, and is obtained from a mammalian gene, e.g., a mammary epithelial
specific gene (e.g., a milk serum protein gene or casein gene). The 3' untranslated region
can be obtained from a casein gene (e.g., a β casein gene), a beta lactoglobulin gene, whey
acid protein gene, or lactalbumin gene. Preferably, the 3' untranslated region is from a goat
β casein gene.



In another aspect, the invention features a nucleic acid molecule encoding a fusion
protein, e.g., a fusion protein as described herein.

In preferred embodiments, the nucleic acid has a nucleotide sequence as shown in
Figure 1B (SEQ ID NO:1), Figure 2 (SEQ ID NO:37), Figure 4B (SEQ ID NO:5), or Figure
5 (SEQ ID NO:12); the nucleic acid has a nucleotide sequence with at least 60%, 70%, 75%,
more preferably at least 85%, more preferably at least 90%, more preferably at least 95%,
most preferably at least 98%, 99% sequence identity or homology with a nucleotide
sequence shown in Figure 1B (SEQ ID NO:1), Figure 2 (SEQ ID NO:37), Figure 4B (SEQ
ID NO:5), or Figure 5 (SEQ ID NO:12); the nucleic acid has a nucleotide sequence that is
capable of hybridizing under stringent conditions to the nucleotide sequence shown in
Figure 1B, Figure 2, Figure 4B, or Figure 5.

In a preferred embodiment, the nucleic acid has a nucleotide sequence which
encodes an amino acid sequence as shown in Figure 1A (SEQ ID NO:2, 3, 4), Figure 4B
(SEQ ID NO:6, 7, 8, 9, 10, 11), or Figure 5 (SEQ ID NO:13, 14, 15, 16); the nucleic acid
has a nucleotide sequence which encodes an amino acid sequence which has at least 60%,
70%, 75%, more preferably at least 85%, more preferably at least 90%, more preferably at
least 95%, most preferably at least 98%, 99% sequence identity or homology with an amino
acid sequence from Figure 1A (SEQ ID NO:2, 3, 4), Figure 4B (SEQ ID NO:6, 7, 8, 9, 10,
11), or Figure 5 (SEQ ID NO:13, 14, 15, 16).



In another aspect, the invention features a host cell, e.g., an isolated host cell (e.g., a 
cultured cell), which includes a nucleic acid of the invention (e.g., a nucleic acid, or a 
transgene, e.g., a nucleic acid construct, as described herein). 



In another aspect, the invention features a fusion protein described herein, or a
purified preparation thereof.

In another aspect, the invention features a pharmaceutical or nutraceutical
composition having a therapeutically effective amount of a fusion protein, e.g., a fusion
protein as described herein, and a pharmaceutically acceptable carrier.

In a preferred embodiment, the composition includes milk. 

In another aspect, the invention features a transgenic animal which includes a
transgene that encodes a fusion protein, e.g., a transgene which encodes a fusion protein
described herein.

Preferred transgenic animals include: mammals; birds; reptiles; marsupials; and
amphibians. Suitable mammals include: ruminants; ungulates; domesticated mammals; and
dairy animals. Particularly preferred animals include: mice, goats, sheep, camels, rabbits,
cows, pigs, horses, oxen, and llamas. Suitable birds include chickens, geese, and turkeys.
Where the transgenic protein is secreted into the milk of a transgenic animal, the animal
should be able to produce at least 1, and more preferably at least 10, or 100, liters of milk
per year. Preferably, the transgenic animal is a ruminant, e.g., a goat, cow or sheep. Most
preferably, the transgenic animal is a goat.

In preferred embodiments, the transgenic mammal has germ cells and somatic cells
containing a transgene that encodes a fusion protein, e.g., a transgene which encodes a
fusion protein described herein.

In preferred embodiments, the fusion protein expressed in the transgenic animal is
under the control of a mammary gland specific promoter, e.g., a milk specific promoter,
e.g., a milk serum protein or casein promoter. The milk specific promoter can be a casein
promoter, beta lactoglobulin promoter, whey acid protein promoter, or lactalbumin
promoter. Preferably, the promoter is a goat β casein promoter.

In preferred embodiments, the transgenic animal is a mammal, and the fusion
protein is secreted into the milk of the transgenic animal at concentrations of at least about
0.1 mg/ml, 0.5 mg/ml, 1.0 mg/ml, 1.5 mg/ml, 2 mg/ml, 3 mg/ml, 5 mg/ml or higher.
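As a rough worked example, a concentration in mg/ml is numerically equal to a concentration in g/L, so multiplying a milk concentration by an annual milk volume gives an approximate annual amount of fusion protein. The sketch below is illustrative only: the function name is hypothetical, and pairing a particular concentration with a particular milk volume is an assumption made for the arithmetic, not a result stated in the specification.

```python
def annual_yield_grams(concentration_mg_per_ml: float, liters_per_year: float) -> float:
    """Approximate grams of protein per year from milk concentration and volume.

    1 mg/ml equals 1 g/L, so the product of the two inputs is already in grams.
    """
    if concentration_mg_per_ml < 0 or liters_per_year < 0:
        raise ValueError("concentration and volume must be non-negative")
    return concentration_mg_per_ml * liters_per_year


if __name__ == "__main__":
    # Hypothetical pairing: 1.0 mg/ml in an animal producing 100 liters per year.
    print(annual_yield_grams(1.0, 100))
```

The calculation ignores losses during purification and variation across a lactation, so it is an upper-bound estimate at best.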
In another aspect, the invention features a method of making a transgenic organism
which has a fusion protein transgene. The method includes providing or forming in a cell of
an organism a fusion protein transgene, e.g., a transgene which encodes a fusion protein
described herein; and allowing the cell, or a descendant of the cell, to give rise to a
transgenic organism.

In a preferred embodiment, the transgenic organism is a transgenic plant or animal.
Preferred transgenic animals include: mammals; birds; reptiles; marsupials; and
amphibians. Suitable mammals include: ruminants; ungulates; domesticated mammals; and
dairy animals. Particularly preferred animals include: mice, goats, sheep, camels, rabbits,
cows, pigs, horses, oxen, and llamas. Suitable birds include chickens, geese, and turkeys.
Where the transgenic protein is secreted into the milk of a transgenic animal, the animal
should be able to produce at least 1, and more preferably at least 10, or 100, liters of milk
per year.

In preferred embodiments, the fusion protein is under the control of a mammary
gland specific promoter, e.g., a milk specific promoter, e.g., a milk serum protein or casein
promoter. The milk specific promoter can be a casein promoter, beta lactoglobulin
promoter, whey acid protein promoter, or lactalbumin promoter. Preferably, the promoter is
a goat β casein promoter.

In preferred embodiments, the organism is a mammal, and the fusion protein is
secreted into the milk of the transgenic animal at concentrations of at least about 0.1 mg/ml,
0.5 mg/ml, 1.0 mg/ml, 1.5 mg/ml, 2 mg/ml, 3 mg/ml, 5 mg/ml or higher.

In another aspect, the invention features a method of selectively killing an aberrant
or diseased cell which expresses on its surface a target antigen, e.g., a cancer cell expressing
a cell surface antigen. The method includes:

contacting said aberrant or diseased cell with an effective amount of a fusion
protein, e.g., a fusion protein described herein, wherein either the first or the second
member of the fusion protein recognizes said target antigen, such that selective killing of the
cell occurs.
The subject method can be used on cells in culture, e.g. in vitro or ex vivo (e.g.,
cultures comprising cancer cells). For example, cells can be cultured in vitro in culture 
medium and the contacting step can be effected by adding the fusion protein of the 
invention to the culture medium. Alternatively, the method can be performed on cells (e.g., 
cancer cells) present in a subject, e.g., as part of an in vivo (e.g., therapeutic or prophylactic) 
protocol. 



In another aspect, the invention features a method of selectively killing an aberrant
or diseased cell which expresses on its surface a target antigen, e.g., a cancer cell expressing
a cell surface antigen. The method includes:

introducing into said aberrant or diseased cell a nucleic acid encoding a fusion
protein, e.g., a fusion protein described herein, wherein either the first or the second
member of the fusion protein recognizes said target antigen, such that selective killing of the
cell occurs.

The subject method can be used on cells in culture, e.g. in vitro or ex vivo (e.g., 
cultures comprising cancer cells). For example, cells can be cultured in vitro in culture 
medium and the nucleic acids of the invention can be introduced to the culture medium. 
Alternatively, the method can be performed on cells (e.g., cancer cells) present in a subject, 
e.g., as part of an in vivo (e.g., therapeutic or prophylactic) gene therapy protocol.



In another aspect, the invention provides a method of treating, in a subject, a
disorder characterized by aberrant growth or activity of a cell which expresses on its surface
a target antigen, e.g., a cancer cell expressing a target antigen. The method includes
administering to the subject an effective amount of a fusion protein, or a nucleic acid
encoding a fusion protein (e.g., a fusion protein described herein), wherein either the first or
the second member of the fusion protein recognizes said target antigen.

In a preferred embodiment, the disease is characterized by aberrant growth or
activity of a cell, e.g., a cancer cell or an immune cell.
In yet another aspect, the present invention provides a method for detecting in vitro
or in vivo the presence of a target antigen in a sample, e.g., for diagnosing a disease. The
method comprises (i) contacting a sample or a control sample with a labelled fusion protein,
e.g., a fusion protein as described herein, under conditions that allow interaction of the
fusion protein and the target antigen, and (ii) detecting formation of a complex. A
statistically significant change in the formation of the complex between the fusion protein
and the target antigen with respect to a control sample is indicative of the presence of the
target antigen in the sample.

In preferred embodiments, the second member is an enzyme, e.g., horseradish 
peroxidase. 

The invention features fusion proteins in which the ability of a first member of the
fusion to form a multimer is chosen so as to optimize a characteristic, e.g., activity or
solubility, of the second member.

The terms peptides, proteins, and polypeptides are used interchangeably herein. 

A purified preparation, substantially pure preparation of a polypeptide, or an isolated
polypeptide as used herein, means a polypeptide that has been separated from at least one
other protein, lipid, or nucleic acid with which it occurs in the cell or organism which
expresses it, e.g., from a protein, lipid, or nucleic acid in a transgenic animal or in a fluid,
e.g., milk, or other substance, e.g., an egg, produced by a transgenic animal. The
polypeptide is preferably separated from substances, e.g., antibodies or gel matrix, e.g.,
polyacrylamide, which are used to purify it. The polypeptide preferably constitutes at least
10, 20, 50, 70, 80 or 95% dry weight of the purified preparation. Preferably, the preparation
contains: sufficient polypeptide to allow protein sequencing; at least 1, 10, or 100 µg of the
polypeptide; at least 1, 10, or 100 mg of the polypeptide.

A substantially pure nucleic acid is a nucleic acid which is one or both of: not
immediately contiguous with either one or both of the sequences, e.g., coding sequences,
with which it is immediately contiguous (i.e., one at the 5' end and one at the 3' end) in the
naturally-occurring genome of the organism from which the nucleic acid is derived; or
which is substantially free of a nucleic acid sequence with which it occurs in the organism
from which the nucleic acid is derived. The term includes, for example, a recombinant
DNA which is incorporated into a vector, e.g., into an autonomously replicating plasmid or
virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate
molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction
endonuclease treatment) independent of other DNA sequences. Substantially pure DNA
also includes a recombinant DNA which is part of a hybrid gene encoding additional fusion
protein sequence.

Homology, or sequence identity, as used herein, refers to the sequence similarity
between two polypeptide molecules or between two nucleic acid molecules. When a
position in the first sequence is occupied by the same amino acid residue or nucleotide as
the corresponding position in the second sequence, then the molecules are homologous at
that position (i.e., as used herein amino acid or nucleic acid "homology" is equivalent to
amino acid or nucleic acid "identity"). The percent homology between the two sequences
is a function of the number of identical positions shared by the sequences (i.e., % homology
= # of identical positions / total # of positions × 100).
For example, if 6 of 10 of the positions in two sequences are matched or homologous then
the two sequences are 60% homologous or have 60% sequence identity. By way of
example, the DNA sequences ATTGCC and TATGGC share 50% homology or sequence
identity. Generally, a comparison is made when two sequences are aligned to give
maximum homology or sequence identity.
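The percent homology formula above can be checked with a short sketch. It assumes the two sequences have already been aligned to the same length and contain no gaps; the function name percent_identity is a hypothetical label introduced here, and the code is not an alignment algorithm.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Implements: # of identical positions / total # of positions x 100.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count positions occupied by the same residue or nucleotide.
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # The example from the text: positions 3, 4 and 6 match, i.e. 3 of 6.
    print(percent_identity("ATTGCC", "TATGGC"))
```

For the two example sequences, three of the six positions match, which reproduces the 50% figure given in the text.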
The comparison of sequences and determination of percent homology between two
sequences can be accomplished using a mathematical algorithm. A preferred, non-limiting
example of a mathematical algorithm utilized for the comparison of sequences is the
algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264-68, modified
as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-77. Such an
algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) of
Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be
performed with the NBLAST program, score = 100, wordlength = 12 to obtain nucleotide
sequences homologous to nucleic acid molecules of the invention. BLAST protein
searches can be performed with the XBLAST program, score = 50, wordlength = 3 to obtain
amino acid sequences homologous to protein molecules of the invention. To obtain
gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in
Altschul et al., (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and
Gapped BLAST programs, the default parameters of the respective programs (e.g.,
XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov. Another
preferred, non-limiting example of a mathematical algorithm utilized for the comparison of
sequences is the algorithm of Myers and Miller, CABIOS (1989). Such an algorithm is
incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence
alignment software package. When utilizing the ALIGN program for comparing amino acid
sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of
4 can be used.
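The search and alignment settings named above can be collected in one place for reference. The sketch below only records the values stated in the text; the class and constant names are hypothetical, and nothing here calls or models the actual command-line interface of any BLAST or ALIGN program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchSettings:
    """Settings stated in the text for one kind of BLAST search."""
    program: str
    score: int
    wordlength: int


@dataclass(frozen=True)
class AlignSettings:
    """Settings stated in the text for ALIGN amino acid comparisons."""
    weight_table: str
    gap_length_penalty: int
    gap_penalty: int


# Values taken directly from the paragraph above.
NUCLEOTIDE_SEARCH = SearchSettings(program="NBLAST", score=100, wordlength=12)
PROTEIN_SEARCH = SearchSettings(program="XBLAST", score=50, wordlength=3)
ALIGN_AMINO_ACID = AlignSettings(weight_table="PAM120", gap_length_penalty=12, gap_penalty=4)

if __name__ == "__main__":
    for settings in (NUCLEOTIDE_SEARCH, PROTEIN_SEARCH, ALIGN_AMINO_ACID):
        print(settings)
```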

As used herein, the term transgene means a nucleic acid sequence (encoding, e.g.,
one or more fusion protein polypeptides), which is introduced into the genome of a
transgenic organism. A transgene can include one or more transcriptional regulatory
sequences and other nucleic acid, such as introns, that may be necessary for optimal
expression and secretion of a nucleic acid encoding the fusion protein. A transgene can
include an enhancer sequence. A fusion protein sequence can be operatively linked to a
tissue specific promoter, e.g., mammary gland specific promoter sequence that results in the
secretion of the protein in the milk of a transgenic mammal, a urine specific promoter, or an
egg specific promoter.

As used herein, the term "transgenic cell" refers to a cell containing a transgene.

A transgenic organism, as used herein, refers to a transgenic animal or plant.

As used herein, a "transgenic animal" is a non-human animal in which one or more,
and preferably essentially all, of the cells of the animal contain a transgene introduced by
way of human intervention, such as by transgenic techniques known in the art. The
transgene can be introduced into the cell, directly or indirectly by introduction into a
precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection
or by infection with a recombinant virus.

Mammals are defined herein as all animals, excluding humans, that have mammary
glands and produce milk.
As used herein, a "dairy animal" refers to a milk producing non-human animal
which is larger than a rodent. In preferred embodiments, the dairy animals produce large
volumes of milk and have long lactating periods, e.g., cows or goats.

As used herein, the language "subject" includes human and non-human animals. 
The term "non-human animals" of the invention includes vertebrates, e.g., mammals and 
non-mammals, such as non-human primates, ruminants, birds, amphibians, reptiles and 
rodents, e.g., mice and rats. The term also includes rabbits. 

As used herein, a "transgenic plant" is a plant, preferably a multi-celled or higher
plant, in which one or more, and preferably essentially all, of the cells of the plant contain a
transgene introduced by way of human intervention, such as by transgenic techniques
known in the art.

As used herein, the term "plant" refers to either a whole plant, a plant part, a plant
cell or a group of plant cells. The class of plants which can be used in methods of the
invention is generally as broad as the class of higher plants amenable to transformation
techniques, including both monocotyledonous and dicotyledonous plants. It includes plants
of a variety of ploidy levels, including polyploid, diploid and haploid.

As used herein, the terms "immunoglobulin" and "antibody" refer to a glycoprotein
comprising at least two heavy (H) chains and two light (L) chains inter-connected by
disulfide bonds. Each heavy chain is comprised of a heavy chain variable region
(abbreviated herein as HCVR or VH) and a heavy chain constant region. The heavy chain
constant region is comprised of three domains, CH1, CH2 and CH3. Each light chain is
comprised of a light chain variable region (abbreviated herein as LCVR or VL) and a light
chain constant region. The light chain constant region is comprised of one domain, CL.
The VH and VL regions can be further subdivided into regions of hypervariability, termed
complementarity determining regions (CDR), interspersed with regions that are more
conserved, termed framework regions (FR). Each VH and VL is composed of three CDRs
and four FRs, arranged from amino-terminus to carboxy-terminus in the following order:
FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4. The variable regions of the heavy and light
chains contain a binding domain that interacts with an antigen. The constant regions of the
antibodies may mediate the binding of the immunoglobulin to host tissues or factors,
including various cells of the immune system (e.g., effector cells) and the first component
(C1q) of the classical complement system.
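The amino-terminus to carboxy-terminus ordering of framework and complementarity determining regions can be illustrated with a small sketch that slices a variable region into its seven segments. The segment lengths and the toy sequence below are hypothetical placeholders chosen only to make the example run; real boundaries depend on the antibody and on the numbering scheme used, neither of which is specified here.

```python
from typing import Dict, List

# Order stated in the text, from amino-terminus to carboxy-terminus.
SEGMENT_ORDER: List[str] = ["FR1", "CDR1", "FR2", "CDR2", "FR3", "CDR3", "FR4"]


def split_variable_region(sequence: str, lengths: Dict[str, int]) -> Dict[str, str]:
    """Slice a VH or VL sequence into FR and CDR segments using given lengths."""
    missing = [name for name in SEGMENT_ORDER if name not in lengths]
    if missing:
        raise ValueError(f"missing segment lengths: {missing}")
    if sum(lengths[name] for name in SEGMENT_ORDER) != len(sequence):
        raise ValueError("segment lengths must add up to the sequence length")
    segments: Dict[str, str] = {}
    start = 0
    for name in SEGMENT_ORDER:
        end = start + lengths[name]
        segments[name] = sequence[start:end]
        start = end
    return segments


if __name__ == "__main__":
    # Hypothetical lengths and a toy sequence, for illustration only.
    toy_lengths = {"FR1": 3, "CDR1": 2, "FR2": 3, "CDR2": 2, "FR3": 3, "CDR3": 2, "FR4": 3}
    print(split_variable_region("AAABBCCCDDEEEFFGGG", toy_lengths))
```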

The term "antigen-binding portion" of an antibody (or simply "antibody portion"), as
used herein, refers to one or more fragments of an antibody that retain the ability to
specifically bind to an antigen (e.g., a target antigen). It has been shown that the
antigen-binding function of an antibody can be performed by fragments of a full-length antibody.
Examples of binding fragments encompassed within the term "antigen-binding portion" of
an antibody include (i) a Fab fragment, a monovalent fragment consisting of the VL, VH,
CL and CH1 domains; (ii) a F(ab')2 fragment, a bivalent fragment comprising two Fab
fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of
the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a
single arm of an antibody; (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546),
which consists of a VH domain; and (vi) an isolated complementarity determining region
(CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded
for by separate genes, they can be joined, using recombinant methods, by a synthetic linker
that enables them to be made as a single protein chain in which the VL and VH regions pair
to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988)
Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883).
Such single chain antibodies are also intended to be encompassed within the term
"antigen-binding portion" of an antibody. These antibody fragments are obtained using conventional
techniques known to those with skill in the art, and the fragments are screened for utility in
the same manner as are intact antibodies.

The term "monoclonal antibody" as used herein refers to an antibody molecule of
single molecular composition. A monoclonal antibody composition displays a single
binding specificity and affinity for a particular epitope. Accordingly, the term "human
monoclonal antibody" refers to antibodies displaying a single binding specificity which
have variable and constant regions derived from human germline immunoglobulin
sequences. In one embodiment, the human monoclonal antibodies are produced by a
hybridoma which includes a B cell obtained from a transgenic non-human animal, e.g., a
transgenic mouse, having a genome comprising a human heavy chain transgene and a light
chain transgene fused to an immortalized cell.

The term "recombinant human antibody", as used herein, is intended to include all
human antibodies that are prepared, expressed, created or isolated by recombinant means,
such as antibodies isolated from an animal (e.g., a mouse) that is transgenic for human
immunoglobulin genes; antibodies expressed using a recombinant expression vector
transfected into a host cell; antibodies isolated from a recombinant, combinatorial human
antibody library; or antibodies prepared, expressed, created or isolated by any other means
that involves splicing of human immunoglobulin gene sequences to other DNA sequences.
Such recombinant human antibodies have variable and constant regions derived from
human germline immunoglobulin sequences. In certain embodiments, however, such
recombinant human antibodies are subjected to in vitro mutagenesis (or, when an animal
transgenic for human Ig sequences is used, in vivo somatic mutagenesis) and thus the amino
acid sequences of the VH and VL regions of the recombinant antibodies are sequences that,
while derived from and related to human germline VH and VL sequences, may not naturally
exist within the human antibody germline repertoire in vivo.

A nucleic acid is "operably linked" when it is placed into a functional relationship
with another nucleic acid sequence. For instance, a promoter or enhancer is operably linked
to a coding sequence if it affects the transcription of the sequence. With respect to
transcription regulatory sequences, operably linked means that the DNA sequences being
linked are contiguous and, where necessary to join two protein coding regions, contiguous
and in reading frame.

The terms "vector" or "construct", as used herein, are intended to refer to a nucleic
acid molecule capable of transporting another nucleic acid to which it has been linked. One
type of vector is a "plasmid", which refers to a circular double stranded DNA loop into
which additional DNA segments may be ligated. Another type of vector is a viral vector,
wherein additional DNA segments may be ligated into the viral genome. Certain vectors
are capable of autonomous replication in a host cell into which they are introduced (e.g.,
bacterial vectors having a bacterial origin of replication and episomal mammalian vectors).
Other vectors (e.g., non-episomal mammalian vectors) can be integrated into the genome of
a host cell upon introduction into the host cell, and thereby are replicated along with the
host genome. Moreover, certain vectors are capable of directing the expression of genes to
which they are operatively linked. Such vectors are referred to herein as "recombinant
expression vectors" (or simply, "expression vectors"). In general, expression vectors of
utility in recombinant DNA techniques are often in the form of plasmids. In the present
specification, "plasmid" and "vector" may be used interchangeably as the plasmid is the
most commonly used form of vector. However, the invention is intended to include such
other forms of expression vectors, such as viral vectors (e.g., replication defective
retroviruses, adenoviruses and adeno-associated vectors).

The term "recombinant host cell" (or simply "host cell"), as used herein, is intended
to refer to a cell into which a recombinant expression vector has been introduced. It should
be understood that such terms are intended to refer not only to the particular subject cell but
to the progeny of such a cell. Because certain modifications may occur in succeeding
generations due to either mutation or environmental influences, such progeny may not, in
fact, be identical to the parent cell, but are still included within the scope of the term "host
cell" as used herein.

Other features and advantages of the invention will be apparent from the following 
detailed description, and from the claims. 

Detailed Description

The drawings are first described. 

Figure 1A is a schematic diagram of a construct containing the genomic sequence of
the light chain (LC) of humanized anti-carcinoembryonic antigen antibody 431. The
location of the signal peptide sequence (s) and the light chain variable (Vk) and the Ck
regions are also indicated. The location of the restriction enzyme sites is also indicated.

Figure 1B depicts the nucleotide and amino acid sequence for the light chain of
humanized anti-carcinoembryonic antigen antibody 431. The location of the restriction
enzyme sites is indicated.

Figure 2 depicts the nucleotide sequence for a Sal I insert containing the coding
sequences for light chain of humanized anti-carcinoembryonic antigen antibody 431.

Figure 3 is a schematic diagram of a construct (Be 458) which includes the Sal I
insert containing the coding sequences for light chain of humanized anti-carcinoembryonic
antigen antibody 431. Also indicated is the location of the silencer, 5' β-casein untranslated
region, the light chain coding region, and the 3' β-casein untranslated region.

Figure 4A is a schematic diagram of a construct containing the genomic sequence of
the heavy chain (HC) of humanized anti-carcinoembryonic antigen antibody 431 linked to
the β-glucuronidase sequence. The location of the signal peptide sequence (s) and the
heavy chain variable (Vh) and CH1 are also indicated. The location of the restriction
enzyme sites is also indicated.

Figure 4B depicts the nucleotide and amino acid sequence for the heavy chain of
humanized anti-carcinoembryonic antigen antibody 431. The location of the restriction
enzyme sites is indicated.

Figure 5 depicts the nucleotide and amino acid sequence for the mutant heavy chain
of humanized anti-carcinoembryonic antigen antibody 431. The mutant heavy chain lacks
the hinge region. The location of the restriction enzyme sites is indicated.

Figure 6 is a schematic diagram of a construct (Be 454) containing the mutant heavy
chain of humanized anti-carcinoembryonic antigen antibody 431 linked to the
β-glucuronidase sequence. Also indicated is the location of the silencer, 5' β-casein
untranslated region, the heavy chain mutant/β-glucuronidase fusion coding region, and the
3' β-casein untranslated region. The location of the restriction enzyme sites is also indicated.

Figure 7 is an overview of the construction of the heavy chain mutants. 

Figure 8 is an enlarged view of the mutations to β-glucuronidase.

The present invention provides, at least in part, transgenically produced fusion
proteins wherein one member of the fusion protein assembles into a multimer and the other
member is chosen, or modified, to promote assembly into the optimal number of subunits.
In one embodiment, the fusion protein includes an immunoglobulin subunit (e.g., an
immunoglobulin heavy or light chain) fused to a toxin (e.g., a subunit of an enzyme). The
immunoglobulin-enzyme fusion proteins described herein serve to target a cytotoxic agent
(e.g., the enzyme) to an undesirable cell, e.g., a tumor cell. For example, the fusion proteins
[...] S.L. et al. Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. Year
Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991
Eur J Immunol 21:1323-1326).

Monoclonal antibodies can also be generated by [...] 86:3833). After immunizing an
animal with an immunogen as described above, the [...] the DNA sequence of the variable
regions of a [...] population of immunoglobulin molecules by using a mixture of oligomer
primers and PCR. For instance, [...] and/or framework 1 (FR1) sequences, as well as primer
to a conserved 3' constant region [...]

In an illustrative embodiment, [...] U.S. Patent No. 4,683,202; Orlandi, et al. PNAS (1989)
86:3833-3837; Sastry et al., PNAS (1989) 86:5728-5732; and Huse et al. [...] of the κ and λ
light chains, as well as primers for the signal sequence. Using variable [...] the display
packages. [...] expression.

The V-gene library cloned from the immunization-derived antibody repertoire can
be expressed by a population of display packages, preferably derived from filamentous
phage, to form an antibody display library. Ideally, the display package comprises a system
that allows the sampling of very large variegated antibody display libraries, rapid sorting
after each affinity separation round, and easy isolation of the antibody gene from purified
[...] libraries (e.g., the Pharmacia Recombinant Phage Antibody System, catalog no. 27-9400-01,
and the Stratagene SurfZAP™ phage display kit, catalog no. 240612), examples of
methods and reagents particularly amenable for use in generating a variegated antibody
[...] et al. International Publication No. WO 92/18619; Dower et al. International Publication
No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al.
International Publication No. WO 92/15679; Breitling et al. International Publication WO
[...] International Publication No. WO 92/09690; Ladner et al. International Publication No. WO
[...] 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991)
Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrad et al. (1991)
[...] 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and
Barbas et al. (1991) PNAS 88:7978-7982.

In certain embodiments, the V region domains of heavy and light chains can be
expressed on the same polypeptide, joined by a flexible linker to form a single-chain Fv
fragment, and the scFV gene subsequently cloned into the desired expression vector or
phage genome. As generally described in McCafferty et al., Nature (1990) 348:552-554,
[...] be used to produce a single chain antibody which can render the display package separable
based on antigen affinity. Isolated scFV antibodies immunoreactive with the antigen can
subsequently be formulated into a pharmaceutical preparation for use in the subject method.

Once displayed on the surface of a display package (e.g., filamentous phage), the
antibody library is screened with the target antigen, or peptide fragment thereof, to identify
and isolate packages that express an antibody having specificity for the target antigen.
Nucleic acid encoding the selected antibody can be recovered from the display package
(e.g., from the phage genome) and subcloned into other expression vectors by standard
recombinant DNA techniques.

Specific antibody molecules with high affinities for a surface protein can be made
according to methods known to those in the art, e.g., methods involving screening of
libraries (Ladner, R.C., et al., U.S. Patent 5,233,409; Ladner, R.C., et al., U.S. Patent
5,403,484). Further, the methods of these libraries can be used in screens to obtain binding
determinants that are mimetics of the structural determinants of antibodies.

In particular, the Fv binding surface of a particular antibody molecule interacts with
its target ligand according to principles of protein-protein interactions; hence sequence data
for VH and VL (the latter of which may be of the κ or λ chain type) is the basis for protein
engineering techniques known to those with skill in the art. Details of the protein surface
that comprises the binding determinants can be obtained from antibody sequence
information, by a modeling procedure using previously determined three-dimensional
structures from other antibodies obtained from NMR studies or crystallographic data. See
for example Bajorath, J. and S. Sheriff, 1996, Proteins: Struct., Funct., and Genet. 24 (2),
152-157; Webster, D.M. and A. R. Rees, 1995, "Molecular modeling of antibody-
combining sites," in S. Paul, Ed., Methods in Molecular Biol. 51, Antibody Engineering
Protocols, Humana Press, Totowa, NJ, pp 17-49; and Johnson, G., Wu, T.T. and E.A.
Kabat, 1995, "Seqhunt: A program to screen aligned nucleotide and amino acid sequences,"
in Methods in Molecular Biol. 51, op. cit., pp 1-15.
In one embodiment, a variegated peptide library is expressed by a population of
display packages to form a peptide display library. Ideally, the display package comprises a
system that allows the sampling of very large variegated peptide display libraries, rapid
sorting after each affinity separation round, and easy isolation of the peptide-encoding gene
from purified display packages. Peptide display libraries can be in, e.g., prokaryotic
organisms and viruses, which can be amplified quickly, are relatively easy to manipulate,
and which allow the creation of a large number of clones. Preferred display packages
include, for example, vegetative bacterial cells, bacterial spores, and most preferably,
bacterial viruses (especially DNA viruses). However, the present invention also
contemplates the use of eukaryotic cells, including yeast and their spores, as potential
display packages. Phage display libraries are described above.

Other techniques include affinity chromatography with an appropriate "receptor",
e.g., a target antigen, followed by identification of the isolated binding agents or ligands by
conventional techniques (e.g., mass spectrometry and NMR). Preferably, the soluble
receptor is conjugated to a label (e.g., fluorophores, colorimetric enzymes, radioisotopes,
or luminescent compounds) that can be detected to indicate ligand binding. Alternatively,
immobilized compounds can be selectively released and allowed to diffuse through a
membrane to interact with a receptor.

Combinatorial libraries of compounds can also be synthesized with "tags" to encode
the identity of each member of the library (see, e.g., W.C. Still et al., International
Application WO 94/08051). In general, this method features the use of inert but readily
detectable tags that are attached to the solid support or to the compounds. When an active
compound is detected, the identity of the compound is determined by identification of the
unique accompanying tag. This tagging method permits the synthesis of large libraries of
compounds which can be identified at very low levels among the total set of all compounds
in the library.
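
The tagging idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that a tag records which building block was used at each synthesis step; the building-block names and the functions decode_tag and enumerate_library are hypothetical and are not taken from the cited application.

```python
from itertools import product


def decode_tag(tag, building_blocks):
    """Return the building blocks named by a tag.

    tag[i] is the index of the block chosen at synthesis step i
    (an assumed encoding, chosen only for this illustration).
    """
    return [building_blocks[step][choice] for step, choice in enumerate(tag)]


def enumerate_library(building_blocks):
    """Map every possible tag to the compound it identifies."""
    index_ranges = (range(len(step)) for step in building_blocks)
    return {tag: decode_tag(tag, building_blocks) for tag in product(*index_ranges)}


# Hypothetical two-step synthesis with 2 x 3 = 6 library members.
blocks = [["A", "B"], ["X", "Y", "Z"]]
library = enumerate_library(blocks)
print(len(library))
print(decode_tag((1, 2), blocks))
```

Because each tag is unique, reading the tag of a single active member is enough to recover its identity without analyzing the compound itself.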

The term modified antibody is also intended to include antibodies, such as
monoclonal antibodies, chimeric antibodies, and humanized antibodies which have been
modified by, e.g., deleting, adding, or substituting portions of the antibody. For example,
an antibody can be modified by deleting the hinge region, thus generating a monovalent
antibody. Any modification is within the scope of the invention so long as the antibody has
at least one antigen binding region specific.

Chimeric mouse-human monoclonal antibodies (i.e., chimeric antibodies) can be
produced by recombinant DNA techniques known in the art. For example, a gene encoding
the Fc constant region of a murine (or other species) monoclonal antibody molecule is
digested with restriction enzymes to remove the region encoding the murine Fc, and the
equivalent portion of a gene encoding a human Fc constant region is substituted (see
Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European
Patent Application 184,187; Taniguchi, M., European Patent Application 171,496;
Morrison et al., European Patent Application 173,494; Neuberger et al., International
Application WO 86/01533; Cabilly et al. U.S. Patent No. 4,816,567; Cabilly et al.,
European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al.
(1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987)
PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985)
Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

The chimeric antibody can be further humanized by replacing sequences of the Fv
variable region which are not directly involved in antigen binding with equivalent
sequences from human Fv variable regions. General reviews of humanized chimeric
antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207 and by Oi et al.,
1986, BioTechniques 4:214. Those methods include isolating, manipulating, and
expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable
regions from at least one of a heavy or light chain. Sources of such nucleic acid are well
known to those skilled in the art and, for example, may be obtained from 7E3, an anti-
GPIIbIIIa antibody producing hybridoma. The recombinant DNA encoding the chimeric
antibody, or fragment thereof, can then be cloned into an appropriate expression vector.
Suitable humanized antibodies can alternatively be produced by CDR substitution (U.S.
Patent 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyan et al. 1988 Science
239:1534; and Beidler et al. 1988 J. Immunol. 141:4053-4060).

All of the CDRs of a particular human antibody may be replaced with at least a
portion of a non-human CDR or only some of the CDRs may be replaced with non-human
CDRs. It is only necessary to replace the number of CDRs required for binding of the
humanized antibody to the Fc receptor.

An antibody can be humanized by any method that is capable of replacing at least
a portion of a CDR of a human antibody with a CDR derived from a non-human antibody.
Winter describes a method which may be used to prepare the humanized antibodies of the
present invention (UK Patent Application GB 2188638A, filed on March 26, 1987), the
contents of which are expressly incorporated by reference. The human CDRs may be
replaced with non-human CDRs using oligonucleotide site-directed mutagenesis.
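
At the sequence level, CDR replacement amounts to swapping defined segments of a human variable region for non-human segments. The following is a minimal sketch under stated assumptions: the toy sequences, the span coordinates, and the function graft_cdrs are hypothetical placeholders, and real CDR boundaries would have to come from an established numbering scheme and the actual antibody sequences.

```python
def graft_cdrs(acceptor, donor_cdrs, cdr_spans):
    """Replace selected CDR spans of a human acceptor sequence.

    acceptor   -- human variable-region sequence (string)
    donor_cdrs -- non-human CDR sequences keyed by CDR name; CDRs that
                  are absent from this mapping are left human
    cdr_spans  -- (start, end) half-open index ranges in the acceptor
    """
    result = acceptor
    # Work right to left so that earlier coordinates stay valid even
    # when a donor CDR differs in length from the one it replaces.
    for name in sorted(cdr_spans, key=lambda n: cdr_spans[n][0], reverse=True):
        if name in donor_cdrs:
            start, end = cdr_spans[name]
            result = result[:start] + donor_cdrs[name] + result[end:]
    return result


# Toy example: "F" marks framework, lower case marks the human CDRs.
human_v = "FFFFaaaFFFFbbbFFFFcccFFFF"
spans = {"CDR1": (4, 7), "CDR2": (11, 14), "CDR3": (18, 21)}
grafted = graft_cdrs(human_v, {"CDR1": "XXX", "CDR3": "ZZZZ"}, spans)
print(grafted)
```

Only CDR1 and CDR3 are replaced here, reflecting the point that only some of the CDRs need be exchanged.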

Also within the scope of the invention are chimeric and humanized antibodies in
which specific amino acids have been substituted, deleted or added. In particular, preferred
humanized antibodies have amino acid substitutions in the framework region, such as to
improve binding to the antigen. For example, in a humanized antibody having mouse
CDRs, amino acids located in the human framework region can be replaced with the amino
acids located at the corresponding positions in the mouse antibody. Such substitutions are
known to improve binding of humanized antibodies to the antigen in some instances.
Antibodies in which amino acids have been added, deleted, or substituted are referred to
herein as modified antibodies or altered antibodies.

Target Antigens 

In preferred embodiments, the first member of the fusion proteins of the present
invention is a targeting agent, e.g., a polypeptide having a high affinity for a target, e.g., an
antibody, a ligand, or an enzyme. Accordingly, the fusion proteins of the invention can be
used to selectively direct (e.g., localize) the second member of the fusion protein to the
vicinity of an undesirable cell.

For example, the first member can be an immunoglobulin that interacts with (e.g.,
binds to) a target antigen. In certain embodiments, the target antigen is present on the
surface of a cell, e.g., an aberrant cell such as a hyperproliferative cell (e.g., a cancer cell).
Exemplary target antigens include carcinoembryonic antigen (CEA), TAG-72, her-2/neu,
epidermal growth factor receptor, and transferrin receptor, among others.

As used herein, "target cell" shall mean any undesirable cell in a subject (e.g., a
human or animal) that can be targeted by a fusion protein of the invention. Exemplary
target cells include tumor cells, such as carcinoma or adenocarcinoma-derived cells (e.g.,
colon, breast, prostate, ovarian and endometrial cancer cells) (Thor, A. et al. (1997) Cancer
Res 46: 3118; Soisson A. P. et al. (1989) Am. J. Obstet. Gynecol.:1258-63). The term
"carcinoma" is art recognized and refers to malignancies of epithelial or endocrine tissues
including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary
system carcinomas, testicular carcinomas, breast carcinomas, ovarian carcinomas, prostatic
carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include
those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and
ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors
composed of carcinomatous and sarcomatous tissues. An "adenocarcinoma" refers to a
carcinoma derived from glandular tissue or in which the tumor cells form recognizable
glandular structures. The term "sarcoma" is art recognized and refers to malignant tumors
of mesenchymal derivation.



Production of Fusion Proteins 

The first and second members of the fusion protein can be linked to each other,
preferably via a linker sequence. The linker sequence should separate the first and second
members of the fusion protein by a distance sufficient to ensure that each member properly
folds into its secondary and tertiary structures. Preferred linker sequences (1) should adopt
a flexible extended conformation, (2) should not exhibit a propensity for developing an
ordered secondary structure which could interact with the functional first and second
members, and (3) should have minimal hydrophobic or charged character, which could
promote interaction with the functional protein domains. Typical surface amino acids in
flexible protein regions include Gly, Asn and Ser. Permutations of amino acid sequences
containing Gly, Asn and Ser would be expected to satisfy the above criteria for a linker
sequence. Other near neutral amino acids, such as Thr and Ala, can also be used in the
linker sequence.

A linker sequence length of 20 amino acids can be used to provide a suitable
separation of functional protein domains, although longer or shorter linker sequences may
also be used. The length of the linker sequence separating the first and second members can
be from 5 to 500 amino acids in length, or more preferably from 5 to 100 amino acids in
length. Preferably, the linker sequence is from about 5-30 amino acids in length. In
preferred embodiments, the linker sequence is from about 5 to about 20 amino acids, and is
advantageously from about 10 to about 20 amino acids. Amino acid sequences useful as
linkers of the first and second member include, but are not limited to, (SerGly4)y wherein y
is greater than or equal to 8, or Gly4SerGly5Ser. A preferred linker sequence has the
formula (SerGly4)y. Another preferred linker has the sequence ((Ser-Ser-Ser-Ser-Gly)3-
Ser-Pro).
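
The linker criteria above can be checked mechanically. The sketch below is illustrative only: it writes linkers in one-letter amino acid code, builds the ((Ser-Ser-Ser-Ser-Gly)3-Ser-Pro) linker and a (SerGly4)y linker with y = 8, and reports which of the stated length ranges each falls into. The function names are hypothetical, and the "about" in the stated ranges is treated as an exact bound for simplicity.

```python
FLEXIBLE = set("GNS")      # Gly, Asn, Ser: typical flexible-region residues
NEAR_NEUTRAL = set("TA")   # Thr, Ala: other near neutral residues

# Length ranges named in the text, from broadest to narrowest.
LENGTH_TIERS = [
    ("5-500", 5, 500),
    ("5-100", 5, 100),
    ("5-30", 5, 30),
    ("5-20", 5, 20),
    ("10-20", 10, 20),
]


def repeat_linker(unit, copies, tail=""):
    """Build a linker from a repeated unit plus an optional tail."""
    return unit * copies + tail


def preferred_fraction(linker):
    """Fraction of residues drawn from Gly, Asn, Ser, Thr or Ala."""
    allowed = FLEXIBLE | NEAR_NEUTRAL
    return sum(res in allowed for res in linker) / len(linker)


def length_tiers(length):
    """Names of the stated length ranges that contain this length."""
    return [name for name, low, high in LENGTH_TIERS if low <= length <= high]


ssssg_linker = repeat_linker("SSSSG", 3, tail="SP")   # (SSSSG)3-SP
sg4_linker = repeat_linker("SGGGG", 8)                # (SerGly4)y, y = 8

for linker in (ssssg_linker, sg4_linker):
    print(linker, len(linker), length_tiers(len(linker)),
          round(preferred_fraction(linker), 2))
```

The 17-residue Ser-Pro-terminated linker falls inside every stated range, while a (SerGly4)y linker with y of at least 8 is 40 residues or longer and therefore falls only in the broader ranges.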

The first and second members can be directly fused without a linker sequence.
Linker sequences are unnecessary where the proteins being fused have non-essential N- or
C-terminal amino acid regions which can be used to separate the functional domains and
prevent steric interference. In preferred embodiments, the C-terminus of the first member can
be directly fused to the N-terminus of the second, or vice versa.


Recombinant Production 

A fusion protein of the invention can be prepared with standard recombinant DNA 
techniques using a nucleic acid molecule encoding the fusion protein. A nucleotide 
sequence encoding a fusion protein can be synthesized by standard DNA synthesis methods. 

A nucleic acid encoding a fusion protein can be introduced into a host cell, e.g., a
cell of a primary or immortalized cell line. The recombinant cells can be used to produce
the fusion protein. A nucleic acid encoding a fusion protein can be introduced into a host 
cell, e.g., by homologous recombination. In most cases, a nucleic acid encoding the 
fusion protein is incorporated into a recombinant expression vector. 

The nucleotide sequence encoding a fusion protein can be operatively linked to one
or more regulatory sequences, selected on the basis of the host cells to be used for
expression. The term "operably linked" means that the sequences encoding the fusion
protein compound are linked to the regulatory sequence(s) in a manner that allows for
expression of the fusion protein. The term "regulatory sequence" refers to promoters,
enhancers and other expression control elements (e.g., polyadenylation signals). Such
regulatory sequences are described, for example, in Goeddel, Gene Expression Technology:
Methods in Enzymology 185, Academic Press, San Diego, CA (1990), the contents of which
are incorporated herein by reference. Regulatory sequences include those that direct
constitutive expression of a nucleotide sequence in many types of host cells, those that
direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific
regulatory sequences) and those that direct expression in a regulatable manner (e.g., only in
the presence of an inducing agent). It will be appreciated by those skilled in the art that the
design of the expression vector may depend on such factors as the choice of the host cell to
be transformed, the level of expression of fusion protein desired, and the like. The fusion
protein expression vectors can be introduced into host cells to thereby produce fusion
proteins encoded by nucleic acids.
Recombinant expression vectors can be designed for expression of fusion proteins in
prokaryotic or eukaryotic cells. For example, fusion proteins can be expressed in bacterial
cells such as E. coli, insect cells (e.g., in the baculovirus expression system), yeast cells or
mammalian cells. Some suitable host cells are discussed further in Goeddel, Gene
Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA
(1990). Examples of vectors for expression in yeast S. cerevisiae include pYepSecl
(Baldari et al., (1987) EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz, (1982) Cell
30:933-943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), and pYES2 (Invitrogen
Corporation, San Diego, CA). Baculovirus vectors available for expression of fusion
proteins in cultured insect cells (e.g., Sf 9 cells) include the pAc series (Smith et al., (1983)
Mol. Cell Biol. 3:2156-2165) and the pVL series (Lucklow, V.A., and Summers, M.D.,
(1989) Virology 170:31-39).

Examples of mammalian expression vectors include pCDM8 (Seed, B., (1987)
Nature 329:840) and pMT2PC (Kaufman et al. (1987), EMBO J. 6:187-195). When used
in mammalian cells, the expression vector's control functions are often provided by viral
regulatory elements. For example, commonly used promoters are derived from polyoma,
Adenovirus 2, cytomegalovirus and Simian Virus 40.

In addition to the regulatory control sequences discussed above, the recombinant
expression vector can contain additional nucleotide sequences. For example, the
recombinant expression vector may encode a selectable marker gene to identify host cells
that have incorporated the vector. Moreover, to facilitate secretion of the fusion protein
from a host cell, in particular mammalian host cells, the recombinant expression vector
can encode a signal sequence operatively linked to sequences encoding the amino-
terminus of the fusion protein such that upon expression, the fusion protein is
synthesized with the signal sequence fused to its amino terminus. This signal sequence
directs the fusion protein into the secretory pathway of the cell and is then cleaved,
allowing for release of the mature fusion protein (i.e., the fusion protein without the
signal sequence) from the host cell. Use of a signal sequence to facilitate secretion of
proteins or peptides from mammalian host cells is known in the art.
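
The relationship between the precursor and the mature fusion protein can be sketched at the sequence level. This is a deliberately simplified illustration: the sequences are arbitrary placeholders rather than real signal peptides or fusion proteins, the function names are hypothetical, and cleavage is modeled as removal of an exact N-terminal prefix.

```python
def make_precursor(signal_sequence, mature_protein):
    """Precursor as synthesized: signal sequence fused to the amino terminus."""
    return signal_sequence + mature_protein


def cleave_signal(precursor, signal_sequence):
    """Return the mature protein, i.e. the precursor without its signal sequence."""
    if not precursor.startswith(signal_sequence):
        raise ValueError("precursor does not begin with the signal sequence")
    return precursor[len(signal_sequence):]


# Placeholder sequences, for illustration only.
signal = "MKLLVLAL"
fusion = "QVQLSGGGGSDIEL"

precursor = make_precursor(signal, fusion)
mature = cleave_signal(precursor, signal)
print(precursor)
print(mature)
```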

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional
transformation or transfection techniques. As used herein, the terms "transformation" and
"transfection" refer to a variety of art-recognized techniques for introducing foreign nucleic
acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-
precipitation, DEAE-dextran-mediated transfection, lipofection, electroporation,
microinjection and viral-mediated transfection. Suitable methods for transforming or
transfecting host cells can be found in Sambrook et al. (Molecular Cloning: A Laboratory
Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory
manuals.

Often only a small fraction of mammalian cells integrate the foreign DNA into their
genome. In order to identify and select these integrants, a gene that encodes a selectable
marker (e.g., resistance to antibiotics) can be introduced into the host cells along with the
gene encoding the fusion protein. Preferred selectable markers include those that confer
resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a
selectable marker can be introduced into a host cell on the same vector as that encoding the
fusion protein or can be introduced on a separate vector. Cells stably transfected with the
introduced nucleic acid can be identified by drug selection (e.g., cells that have incorporated
the selectable marker gene will survive, while the other cells die).

A recombinant expression vector can be transcribed and translated in vitro, for 
example using T7 promoter regulatory sequences and T7 polymerase. 


Transgenic Mammals 

Methods for generating non-human transgenic animals are described herein. DNA 
constructs can be introduced into the germ line of a mammal to make a transgenic mammal. 
For example, one or several copies of the construct can be incorporated into the genome of a 
mammalian embryo by standard transgenic techniques.

It is often desirable to express the transgenic protein in the milk of a transgenic
mammal. Mammals that produce large volumes of milk and have long lactating periods are
preferred. Preferred mammals are ruminants, e.g., cows, sheep, camels or goats, e.g., goats
of Swiss origin, e.g., the Alpine, Saanen and Toggenburg breed goats. Other preferred
animals include oxen, rabbits and pigs.

In an exemplary embodiment, a transgenic non-human animal is produced by
introducing a transgene into the germline of the non-human animal. Transgenes can be
introduced into embryonal target cells at various developmental stages. Different methods
are used depending on the stage of development of the embryonal target cell. The specific
line(s) of any animal used should, if possible, be selected for general good health, good
embryo yields, good pronuclei visibility in the embryo, and good reproductive fitness.

Introduction of the fusion protein transgene into the embryo can be accomplished by
any of a variety of means known in the art such as microinjection, electroporation, or
lipofection. For example, a fusion protein transgene can be introduced into a mammal by
microinjection of the construct into the pronuclei of the fertilized mammalian egg(s) to
cause one or more copies of the construct to be retained in the cells of the developing
mammal(s). Following introduction of the transgene construct into the fertilized egg, the
egg can be incubated in vitro for varying amounts of time, or reimplanted into the surrogate
host, or both. One common method is to incubate the embryos in vitro for about 1-7 days,
depending on the species, and then reimplant them into the surrogate host.

The progeny of the transgenically manipulated embryos can be tested for the
presence of the construct by Southern blot analysis of a segment of tissue. An embryo
having one or more copies of the exogenous cloned construct stably integrated into the
genome can be used to establish a permanent transgenic mammal line carrying the
transgenically added construct.

Litters of transgenically altered mammals can be assayed after birth for the
incorporation of the construct into the genome of the offspring. This can be done by
hybridizing a probe corresponding to the DNA sequence coding for the fusion protein or a
segment thereof onto chromosomal material from the progeny. Those mammalian progeny
found to contain at least one copy of the construct in their genome are grown to maturity.
The females among these progeny will produce the desired protein in or along with their
milk. The transgenic mammals can be bred to produce other transgenic progeny useful in
producing the desired proteins in their milk.

Transgenic females may be tested for protein secretion into milk, using an art-known
assay technique, e.g., a Western blot or enzymatic assay.

Other Transgenic Animals 

Fusion protein can be expressed from a variety of transgenic animals. A protocol for
the production of a transgenic pig can be found in White and Yannoutsos, Current Topics in
Complement Research: 64th Forum in Immunology, pp. 88-94; US Patent No. 5,523,226;
US Patent No. 5,573,933; PCT Application WO93/25071; and PCT Application
WO95/04744. A protocol for the production of a transgenic mouse can be found in US
Patent No. 5,530,177. A protocol for the production of a transgenic rat can be found in
Bader and Ganten, Clinical and Experimental Pharmacology and Physiology, Supp. 3:S81-
S87, 1996. A protocol for the production of a transgenic cow can be found in Transgenic
Animal Technology, A Handbook, 1994, ed., Carl A. Pinkert, Academic Press, Inc. A
protocol for the production of a transgenic sheep can be found in Transgenic Animal
Technology, A Handbook, 1994, ed., Carl A. Pinkert, Academic Press, Inc. A protocol for
the production of a transgenic rabbit can be found in Hammer et al., Nature 315:680-683,
1985 and Taylor and Fan, Frontiers in Bioscience 2:d298-308, 1997.

Production of Transgenic Protein in the Milk of a Transgenic Animal

Milk Specific Promoters 

Useful transcriptional promoters are those promoters that are preferentially activated
in mammary epithelial cells, including promoters that control the genes encoding milk
proteins such as caseins, beta lactoglobulin (Clark et al., (1989) Bio/Technology 7:487-
492), whey acid protein (Gorton et al. (1987) Bio/Technology 5:1183-1187), and
lactalbumin (Soulier et al., (1992) FEBS Letts. 297:13). The alpha, beta, gamma or kappa
casein gene promoter of any mammalian species can be used to provide mammary
expression; a preferred promoter is the goat beta casein gene promoter (DiTullio, (1992)
Bio/Technology 10:74-77). Milk-specific protein promoters or the promoters that are
specifically activated in mammary tissue can be isolated from cDNA or genomic sequences.
Preferably, they are genomic in origin.

DNA sequence information is available for mammary gland specific genes listed
above, in at least one, and often in several organisms. See, e.g., Richards et al., J. Biol.
Chem. 256, 526-532 (1981) (α-lactalbumin rat); Campbell et al., Nucleic Acids Res. 12,
8685-8697 (1984) (rat WAP); Jones et al., J. Biol. Chem. 260, 7042-7050 (1985) (rat β-
casein); Yu-Lee & Rosen, J. Biol. Chem. 258, 10794-10804 (1983) (rat γ-casein); Hall,
Biochem. J. 242, 735-742 (1987) (α-lactalbumin human); Stewart, Nucleic Acids Res. 12,
389 (1984) (bovine αs1 and κ casein cDNAs); Gorodetsky et al., Gene 66, 87-96 (1988)
(bovine β casein); Alexander et al., Eur. J. Biochem. 178, 395-401 (1988) (bovine κ casein);
Brignon et al., FEBS Lett. 188, 48-55 (1977) (bovine αs2 casein); Jamieson et al., Gene 61,
85-90 (1987), Ivanov et al., Biol. Chem. Hoppe-Seyler 369, 425-429 (1988), Alexander et
al., Nucleic Acids Res. 17, 6739 (1989) (bovine β lactoglobulin); Vilotte et al., Biochimie
69, 609-620 (1987) (bovine α-lactalbumin). The structure and function of the various milk
protein genes are reviewed by Mercier & Vilotte, J. Dairy Sci. 76, 3079-3098 (1993)
(incorporated by reference in its entirety for all purposes). If additional flanking sequences
are useful in optimizing expression, such sequences can be cloned using the existing
sequences as probes. Mammary-gland specific regulatory sequences from different
organisms can be obtained by screening libraries from such organisms using known cognate
nucleotide sequences, or antibodies to cognate proteins as probes.

Signal Sequences 

Useful signal sequences are milk-specific signal sequences or other signal sequences
which result in the secretion of eukaryotic or prokaryotic proteins. Preferably, the signal
sequence is selected from milk-specific signal sequences, i.e., it is from a gene which
encodes a product secreted into milk. Most preferably, the milk-specific signal sequence is
related to the milk-specific promoter used in the expression system of this invention. The
size of the signal sequence is not critical for this invention. All that is required is that the
sequence be of a sufficient size to effect secretion of the desired recombinant protein, e.g.,
in the mammary tissue. For example, signal sequences from genes coding for caseins, e.g.,
alpha, beta, gamma or kappa caseins, beta lactoglobulin, whey acid protein, and lactalbumin
are useful in the present invention. A preferred signal sequence is the goat β-casein signal
sequence.

Signal sequences from other secreted proteins, e.g., immunoglobulins, or proteins
secreted by liver cells, kidney cells, or pancreatic cells can also be used.

Insulator Sequences

The DNA constructs of the invention further comprise at least one insulator
sequence. The terms "insulator", "insulator sequence" and "insulator element" are used
interchangeably herein. An insulator element is a control element which insulates the
transcription of genes placed within its range of action but which does not perturb gene
expression, either negatively or positively. Preferably, an insulator sequence is inserted on
either side of the DNA sequence to be transcribed. For example, the insulator can be
positioned about 200 bp to about 1 kb, 5' from the promoter, and at least about 1 kb to 5 kb
from the promoter, at the 3' end of the gene of interest. The distance of the insulator
sequence from the promoter and the 3' end of the gene of interest can be determined by
those skilled in the art, depending on the relative sizes of the gene of interest, the promoter
and the enhancer used in the construct. In addition, more than one insulator sequence can
be positioned 5' from the promoter or at the 3' end of the transgene. For example, two or
more insulator sequences can be positioned 5' from the promoter. The insulator or
insulators at the 3' end of the transgene can be positioned at the 3' end of the gene of
interest, or at the 3' end of a 3' regulatory sequence, e.g., a 3' untranslated region (UTR) or a
3' flanking sequence.

A preferred insulator is a DNA segment which encompasses the 5' end of the
chicken β-globin locus and corresponds to the chicken 5' constitutive hypersensitive site as
described in PCT Publication 94/23046, the contents of which are incorporated herein by
reference.



DNA Constructs

A fusion protein can be expressed from a construct which includes a promoter
specific for mammary epithelial cells, e.g., a casein promoter, e.g., a goat beta casein
promoter, a milk-specific signal sequence, e.g., a casein signal sequence, e.g., a β-casein
signal sequence, and a DNA encoding a fusion protein.

A construct can also include a 3' untranslated region downstream of the DNA
sequence coding for the non-secreted protein. Such regions can stabilize the RNA transcript
of the expression system and thus increase the yield of desired protein from the expression
system. Among the 3' untranslated regions useful in the constructs of this invention are
sequences that provide a poly A signal. Such sequences may be derived, e.g., from the
SV40 small t antigen, the casein 3' untranslated region or other 3' untranslated sequences
well known in the art. Preferably, the 3' untranslated region is derived from a milk specific
protein. The length of the 3' untranslated region is not critical but the stabilizing effect of its
poly A transcript appears important in stabilizing the RNA of the expression sequence.

A construct can include a 5' untranslated region between the promoter and the DNA
sequence encoding the signal sequence. Such untranslated regions can be from the same
control region from which the promoter is taken or can be from a different gene, e.g., they may
be derived from other synthetic, semi-synthetic or natural sources. Again, their specific
length is not critical; however, they appear to be useful in improving the level of expression.

A construct can also include about 10%, 20%, 30%, or more of the N-terminal
coding region of a gene preferentially expressed in mammary epithelial cells. For example,
the N-terminal coding region can correspond to the promoter used, e.g., a goat β-casein
N-terminal coding region.
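
The relative order of the construct elements described above can be summarized in a short sketch. It is illustrative only: the Element class and the assemble function are hypothetical names, the element sources are examples mentioned in the text, and the sketch covers only the elements whose relative positions the text states (promoter, 5' untranslated region, signal sequence, fusion protein coding sequence, 3' untranslated region). Insulators and the optional N-terminal coding region are left out.

```python
from dataclasses import dataclass

# 5'-to-3' order of the elements whose relative position is stated in the text.
ORDER = ["promoter", "5' UTR", "signal sequence", "coding sequence", "3' UTR"]
REQUIRED = {"promoter", "signal sequence", "coding sequence"}


@dataclass(frozen=True)
class Element:
    role: str     # one of the names in ORDER
    source: str   # where the element is taken from


def assemble(elements):
    """Return the elements in 5'-to-3' order after basic sanity checks."""
    roles = [e.role for e in elements]
    unknown = set(roles) - set(ORDER)
    if unknown:
        raise ValueError(f"unknown element role(s): {sorted(unknown)}")
    if len(roles) != len(set(roles)):
        raise ValueError("each role may appear only once in this sketch")
    missing = REQUIRED - set(roles)
    if missing:
        raise ValueError(f"missing required element(s): {sorted(missing)}")
    return sorted(elements, key=lambda e: ORDER.index(e.role))


parts = [
    Element("3' UTR", "casein 3' untranslated region"),
    Element("coding sequence", "DNA encoding the fusion protein"),
    Element("promoter", "goat beta casein promoter"),
    Element("signal sequence", "beta casein signal sequence"),
]
construct = assemble(parts)
print(" -> ".join(e.role for e in construct))
```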

Prior art methods can include making a construct and testing it for the ability to
produce a product in cultured cells prior to placing the construct in a transgenic animal.
Surprisingly, the inventors have found that such a protocol may not be of predictive value in
determining if a normally non-secreted protein can be secreted, e.g., in the milk of a
transgenic animal. Therefore, it may be desirable to test constructs directly in transgenic
animals, e.g., transgenic mice, as some constructs which fail to be secreted in CHO cells are
secreted into the milk of transgenic animals.






Purification from milk 

The transgenic fusion protein can be produced in milk at relatively high
concentrations and in large volumes, providing continuous high level output of normally
processed peptide that is easily harvested from a renewable resource. There are several
different methods known in the art for isolation of proteins from milk.

Milk proteins usually are isolated by a combination of processes. Raw milk first is
fractionated to remove fats, for example, by skimming, centrifugation, sedimentation (H.E.
Swaisgood, Developments in Dairy Chemistry, I: Chemistry of Milk Protein, Applied
Science Publishers, NY, 1982), acid precipitation (U.S. Patent No. 4,644,056) or enzymatic
coagulation with rennin or chymotrypsin (Swaisgood, ibid). Next, the major milk proteins
may be fractionated into either a clear solution or a bulk precipitate from which the specific
protein of interest may be readily purified.

USSN 08/648,235 discloses a method for isolating a soluble milk component, such
as a peptide, in its biologically active form from whole milk or a milk fraction by tangential
flow filtration. Unlike previous isolation methods, this eliminates the need for a first
fractionation of whole milk to remove fat and casein micelles, thereby simplifying the
process and avoiding losses of recovery and bioactivity. This method may be used in
combination with additional purification steps to further remove contaminants and purify
the component of interest.

Production of Transgenic Protein in the Egg of a Transgenic Animal

A fusion protein can be produced in tissues, secretions, or other products, e.g., an
egg, of a transgenic animal. For example, fusion proteins can be produced in the eggs of a
transgenic animal, preferably a transgenic turkey, duck, goose, ostrich, guinea fowl,
peacock, partridge, pheasant, pigeon, and more preferably a transgenic chicken, using
methods known in the art (Sang et al., Trends Biotechnology, 12:415-20, 1994). Genes
encoding proteins specifically expressed in the egg, such as yolk-protein genes and
albumin-protein genes, can be modified to direct expression of fusion protein.

Egg Specific Promoters

Useful transcriptional promoters are those promoters that are preferentially activated
in the egg, including promoters that control the genes encoding egg proteins, e.g.,
ovalbumin, lysozyme and avidin. Promoters from the chicken ovalbumin, lysozyme or
avidin genes are preferred. Egg-specific protein promoters or the promoters that are
specifically activated in egg tissue can be from cDNA or genomic sequences. Preferably,
the egg-specific promoters are genomic in origin.

DNA sequences of egg specific genes are known in the art (see, e.g., Burley et al.,
"The Avian Egg", John Wiley and Sons, p. 472, 1989, the contents of which are
incorporated herein by reference). If additional flanking sequences are useful in optimizing
expression, such sequences can be cloned using the existing sequences as probes. Egg
specific regulatory sequences from different organisms can be obtained by screening
libraries from such organisms using known cognate nucleotide sequences, or antibodies to
cognate proteins, as probes.

Transgenic Plants

A fusion protein can be expressed in a transgenic organism, e.g., a transgenic plant,
e.g., a transgenic plant in which the DNA transgene is inserted into the nuclear or plastidic
genome. Plant transformation is known in the art. See, in general, Methods in Enzymology
Vol. 153 ("Recombinant DNA Part D") 1987, Wu and Grossman Eds., Academic Press and
European Patent Application EP 693554.

Foreign nucleic acid can be introduced into plant cells or protoplasts by several
methods. For example, nucleic acid can be mechanically transferred by microinjection
directly into plant cells by use of micropipettes. Foreign nucleic acid can also be transferred
into a plant cell by using polyethylene glycol, which forms a precipitation complex with the
genetic material that is taken up by the cell (Paszkowski et al. (1984) EMBO J. 3:2712-22).
Foreign nucleic acid can be introduced into a plant cell by electroporation (Fromm et al.
(1985) Proc. Natl. Acad. Sci. USA 82:5824). In this technique, plant protoplasts are
electroporated in the presence of plasmids or nucleic acids containing the relevant genetic
construct. Electrical impulses of high field strength reversibly permeabilize biomembranes,
allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell
wall, divide, and form a plant callus. Selection of the transformed plant cells with the
transformed gene can be accomplished using phenotypic markers.

Cauliflower mosaic virus (CaMV) can be used as a vector for introducing foreign
nucleic acid into plant cells (Hohn et al. (1982) "Molecular Biology of Plant Tumors,"
Academic Press, New York, pp. 549-560; Howell, U.S. Pat. No. 4,407,956). The CaMV viral
DNA genome is inserted into a parent bacterial plasmid, creating a recombinant DNA
molecule which can be propagated in bacteria. The recombinant plasmid can be further
modified by introduction of the desired DNA sequence. The modified viral portion of the
recombinant plasmid is then excised from the parent bacterial plasmid and used to
inoculate the plant cells or plants.

High velocity ballistic penetration by small particles can be used to introduce
foreign nucleic acid into plant cells. Nucleic acid is disposed within the matrix of small
beads or particles, or on the surface (Klein et al. (1987) Nature 327:70-73). Although
typically only a single introduction of a new nucleic acid segment is required, this method
also provides for multiple introductions.

A nucleic acid can be introduced into a plant cell by infection of a plant cell, an
explant, a meristem or a seed with Agrobacterium tumefaciens transformed with the nucleic
acid. Under appropriate conditions, the transformed plant cells are grown to form shoots
and roots, and develop further into plants. The nucleic acids can be introduced into plant cells,
for example, by means of the Ti plasmid of Agrobacterium tumefaciens. The Ti plasmid is
transmitted to plant cells upon infection by Agrobacterium tumefaciens, and is stably
integrated into the plant genome (Horsch et al. (1984) "Inheritance of Functional Foreign
Genes in Plants," Science 233:496-498; Fraley et al. (1983) Proc. Natl. Acad. Sci. USA
80:4803).

Plants from which protoplasts can be isolated and cultured to give whole regenerated
plants can be transformed so that whole plants are recovered which contain the transferred
foreign gene. Some suitable plants include, for example, species from the genera Fragaria,
Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium,
Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum,
Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium,
Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, Nemesia,
Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis,
Browallia, Glycine, Lolium, Zea, Triticum, Sorghum, and Datura.

Plant regeneration from cultured protoplasts is described in Evans et al., "Protoplasts
Isolation and Culture," Handbook of Plant Cell Cultures 1:124-176 (MacMillan Publishing
Co. New York 1983); M.R. Davey, "Recent Developments in the Culture and Regeneration
of Plant Protoplasts," Protoplasts (1983)-Lecture Proceedings, pp. 12-29, (Birkhauser,
Basel 1983); P.J. Dale, "Protoplast Culture and Plant Regeneration of Cereals and Other
Recalcitrant Crops," Protoplasts (1983)-Lecture Proceedings, pp. 31-41, (Birkhauser, Basel
1983); and H. Binding, "Regeneration of Plants," Plant Protoplasts, pp. 21-73, (CRC Press,
Boca Raton 1985).

Regeneration from protoplasts varies from species to species of plants, but generally
a suspension of transformed protoplasts containing copies of the exogenous sequence is first
generated. In certain species, embryo formation can then be induced from the protoplast
suspension, to the stage of ripening and germination as natural embryos. The culture media
can contain various amino acids and hormones, such as auxin and cytokinins. It can also be
advantageous to add glutamic acid and proline to the medium, especially for such species as
corn and alfalfa. Shoots and roots normally develop simultaneously. Efficient regeneration
will depend on the medium, on the genotype, and on the history of the culture. If these
three variables are controlled, then regeneration is fully reproducible and repeatable.

In vegetatively propagated crops, the mature transgenic plants can be propagated by
the taking of cuttings or by tissue culture techniques to produce multiple identical plants for
trialling, such as testing for production characteristics. Selection of a desirable transgenic
plant is made and new varieties are obtained thereby, and propagated vegetatively for
commercial sale. In seed propagated crops, the mature transgenic plants can be self crossed
to produce a homozygous inbred plant. The inbred plant produces seed containing the gene
for the newly introduced foreign gene activity level. These seeds can be grown to produce
plants that have the selected phenotype. The inbreds according to this invention can be used
to develop new hybrids. In this method a selected inbred line is crossed with another inbred
line to produce the hybrid.

Parts obtained from a transgenic plant, such as flowers, seeds, leaves, branches,
fruit, and the like are covered by the invention, provided that these parts include cells which
have been so transformed. Progeny and variants, and mutants of the regenerated plants are
also included within the scope of this invention, provided that these parts comprise the
introduced DNA sequences. Progeny and variants, and mutants of the regenerated plants
are also included within the scope of this invention.

Selection of transgenic plants or plant cells can be based upon a visual assay, such as
observing color changes (e.g., a white flower, variable pigment production, and uniform
color pattern on flowers or irregular patterns), but can also involve biochemical assays of
either enzyme activity or product quantitation. Transgenic plants or plant cells are grown
into plants bearing the plant part of interest and the gene activities are monitored, such as by
visual appearance (for flavonoid genes) or biochemical assays (Northern blots; Western
blots; enzyme assays and flavonoid compound assays, including spectroscopy; see
Harborne et al. (Eds.), (1975) The Flavonoids, Vols. 1 and 2, Acad. Press). Appropriate
plants are selected and further evaluated. Methods for generation of genetically engineered
plants are further described in US Patent No. 5,283,184, US Patent No. 5,482,852, and
European Patent Application EP 693 554, all of which are hereby incorporated by reference.

Embodiments of the invention are further illustrated by the following examples,
which should not be construed as being limiting. The contents of all references
(including literature references, issued patents, published patent applications, and
co-pending patent applications) cited throughout this application are hereby expressly
incorporated by reference.



Examples 1 and 2 below describe the generation of two constructs: a light chain
construct and a heavy chain/β-glucuronidase fusion construct. Two plasmids, one
containing a clone of an antibody heavy chain/human β-glucuronidase fusion protein and
the other containing kappa light chain sequence, were obtained from Behringwerke
AG.



EXAMPLE 1: Construction of Light Chain (LC) Construct

This Example describes the generation of a light chain nucleic acid construct using
the light chain nucleotide sequence from a humanized monoclonal antibody against
carcinoembryonic antigen (431) subcloned into a mammary specific expression vector
(Bc163) and a commercial mammalian expression vector (pcDNA3).

Briefly, a Hind III-Eco RI fragment containing the light chain sequence was
subcloned into pGEM3z to facilitate further manipulation. Two mutations were made:

a) To create a Sal I, Xho I, and Kozak consensus sequence at the beginning of the
coding region; and

b) To create a Sal I site immediately after the termination codon.

The original construct contained approximately 1300 bases of unknown sequence.
To remove the unknown sequences, the Gapped Heteroduplex method was used to create a
Sal I site just after the termination codon. Sac I sites just before the termination codon and
near the Eco RI site were used to make the gap, which was filled using Klenow fragment,
deoxynucleotides, T4 DNA ligase, and the following oligonucleotide:

TGT TAG AGGTCGACG CCC CAC (SEQ ID NO:21)
    term  Sal I

The gapped region (through the termination codon and new Sal I site) was then sequenced
to confirm that no unintended changes were made in the sequence.
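
As an illustration only, the annotations under the oligonucleotide (a termination codon followed by a Sal I site) can be checked with a few lines of Python. This is a minimal sketch that assumes the sequence printed above for SEQ ID NO:21 is transcribed correctly; the helper name `find_all` is hypothetical and is not part of the original disclosure.

```python
# Mutagenic oligonucleotide as printed in the text (SEQ ID NO:21), spaces removed.
OLIGO = "TGTTAGAGGTCGACGCCCCAC"

SAL_I = "GTCGAC"                      # Sal I recognition sequence
STOP_CODONS = ("TAA", "TAG", "TGA")   # the three standard termination codons


def find_all(seq, motif):
    """Return every 0-based start index of motif in seq (overlaps allowed)."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


sal_hits = find_all(OLIGO, SAL_I)
stop_hits = {codon: find_all(OLIGO, codon) for codon in STOP_CODONS}

# Number of bases between the end of the TAG codon and the start of the Sal I site.
gap = sal_hits[0] - (stop_hits["TAG"][0] + 3)

print("Sal I site(s) at:", sal_hits)
print("stop codon positions:", stop_hits)
print("bases between stop codon and Sal I site:", gap)
```

Under these assumptions the only termination codon in the oligonucleotide is the TAG labelled "term", and the Sal I site begins two bases after it, which is consistent with the stated aim of placing the site immediately after the termination codon.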

A second Nco I site was found in the unknown sequence; this site had to be removed for a
subsequent step described below. To remove this site, the construct containing the new Sal
I site was digested with Eco RI, the ends were filled with Klenow fragment and
deoxynucleotides, and the construct was ligated to a Sal I linker (purchased from New
England Biolabs) following routine experimental procedures. This construct containing two
Sal I sites was then digested with Sal I and religated, removing the unknown sequence
containing the second Nco I site.

A Sal I site and Kozak consensus sequence were then inserted immediately before
the initial methionine codon (instead of simply changing the Hind III site) because there
were several ATG sequences prior to the correct starting codon that could possibly have
been used as alternative start sites. While these ATG sequences do not seem to be a
problem in tissue culture, the safest route was to remove this region. These ATG sequences
were removed by excising the Hind III-Nco I region and replacing it with a Hind III-Nco I
adapter containing Sal I and Xho I sites and a Kozak consensus sequence. The replaced
region was also confirmed by sequencing.

The sequence changes were as follows:

The original 5' region had the nucleotide sequence (ATG sequences are capitalized;
ATG corresponding to initial methionine is indicated in bold):

aagctt ATG aat ATG caaatcctgctc ATG aat ATG caaatcctctga
atctac ATG gtaaatataggtttgtctataccacaaacagaaaaac ATG agat
cacagttctctctacagttactgagcacacaggacctcacc ATG
(SEQ ID NO:22)

The original sequence was replaced with the following replacement sequence:

Hind III Sal I  Xho I
AAGCTT   GTCGAC CTCGAG CCACC ATG (SEQ ID NO:23)
                       Kozak (consensus sequence)
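
The effect of this replacement can be sketched in Python: count the ATG triplets that lie upstream of the intended start codon in each version, and locate the engineered sites in the replacement. The sketch assumes that the sequences printed above are transcribed correctly and that the last ATG in each sequence is the initial methionine codon; the function name `upstream_atgs` is hypothetical.

```python
# Original 5' region (SEQ ID NO:22) as printed, joined without spaces.
ORIGINAL_5P = (
    "aagctt" "ATG" "aat" "ATG" "caaatcctgctc" "ATG" "aat" "ATG" "caaatcctctga"
    "atctac" "ATG" "gtaaatataggtttgtctataccacaaacagaaaaac" "ATG" "agat"
    "cacagttctctctacagttactgagcacacaggacctcacc" "ATG"
)

# Replacement 5' region (SEQ ID NO:23): Hind III, Sal I, Xho I, Kozak, ATG.
REPLACEMENT_5P = "AAGCTT" "GTCGAC" "CTCGAG" "CCACC" "ATG"


def upstream_atgs(seq):
    """Count ATG triplets located before the last ATG (assumed to be the start codon)."""
    s = seq.upper()
    last = s.rfind("ATG")
    return s[:last].count("ATG")


# 0-based positions of the engineered features in the replacement sequence.
sites = {
    "Hind III": REPLACEMENT_5P.find("AAGCTT"),
    "Sal I": REPLACEMENT_5P.find("GTCGAC"),
    "Xho I": REPLACEMENT_5P.find("CTCGAG"),
    "Kozak + ATG": REPLACEMENT_5P.find("CCACCATG"),
}

print("upstream ATGs, original:   ", upstream_atgs(ORIGINAL_5P))
print("upstream ATGs, replacement:", upstream_atgs(REPLACEMENT_5P))
print("feature positions:", sites)
```

With the sequences as printed, the original region carries six ATG triplets ahead of the assumed start codon, while the replacement carries none, which matches the rationale given for removing this region.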



The Sal I fragment containing the entire coding region of the light chain was then subcloned
into the Xho I site of Bc163, a mammary specific expression vector, and pcDNA3, a
commercial mammalian expression vector. Orientation was determined by restriction
enzyme analysis and/or sequencing. Figure 1A is a schematic diagram of the light chain
construct (431A). The nucleotide and amino acid sequences are shown in Figure 1B.
Figure 2 depicts the nucleotide sequence for a Sal I insert containing the coding sequences
for light chain of humanized anti-carcinoembryonic antigen antibody 431. Shown as Figure
3 is a schematic diagram of a construct (Bc458) which includes the Sal I insert containing
the coding sequences for light chain of humanized anti-carcinoembryonic antigen antibody
431. Also indicated is the location of the silencer, the 5' β-casein untranslated region, the
light chain coding region, and the 3' β-casein untranslated region.
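
Subcloning a Sal I fragment into an Xho I site works because both enzymes leave the same four-base 5' overhang. The following sketch illustrates this using the standard cut positions (Sal I cuts G^TCGAC, Xho I cuts C^TCGAG); the function names are hypothetical and the code is only an illustration, not a description of the procedure actually used.

```python
# Recognition sequence and cut offset (bases from the 5' end of the top strand).
ENZYMES = {
    "Sal I": ("GTCGAC", 1),   # G^TCGAC
    "Xho I": ("CTCGAG", 1),   # C^TCGAG
}


def five_prime_overhang(site, cut):
    """Single-stranded 5' overhang left by a symmetric cut in a palindromic site."""
    return site[cut:len(site) - cut]


def junction(left_site, right_site, cut=1):
    """Sequence formed when the left half of one cut site is ligated to the right half of another."""
    return left_site[:cut] + right_site[cut:]


overhangs = {name: five_prime_overhang(site, cut) for name, (site, cut) in ENZYMES.items()}

# Junctions created when a Sal I fragment is ligated into an Xho I-cut vector.
left_junction = junction("CTCGAG", "GTCGAC")    # vector half + insert half
right_junction = junction("GTCGAC", "CTCGAG")   # insert half + vector half

print(overhangs)
print(left_junction, right_junction)
```

Both overhangs are TCGA, so the ends anneal. The resulting hybrid junctions (CTCGAC and GTCGAG) match neither recognition sequence, so neither enzyme would be expected to re-cut them, and the insert can ligate in either direction, which is why orientation has to be determined afterwards.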



EXAMPLE 2: Construction of Heavy Chain/β-Glucuronidase Fusion Construct

This Example describes the generation of a heavy chain/β-glucuronidase fusion
construct using the heavy chain nucleotide sequence from a humanized monoclonal
antibody against carcinoembryonic antigen (431) subcloned into a mammary specific
expression vector (Bc163) and a commercial mammalian expression vector (pcDNA3).

The Hind III-Xba I fragment containing the heavy chain/β-glucuronidase fusion
sequence was subcloned into pGEM3z to facilitate further manipulation. Three mutations
were made to the coding region of the heavy chain/β-glucuronidase fusion construct:

a) to create a Sal I, Xho I, and Kozak consensus sequence at the beginning of the
coding region;

b) to change the sequence at the internal Sal I site while retaining the correct amino
acid sequence; and

c) to create a Sal I site immediately after the termination codon.

The signal sequence that was used for the light chain was also used for the heavy
chain. Again, the region between the Hind III and Nco I sites was removed and
replaced with the same set of oligonucleotides used in the light chain to create a Sal
I site and Kozak consensus sequence immediately before the initial methionine
codon (see above).

The internal Sal I site had to be changed for the purpose of subcloning the fragment
into a beta casein expression vector.



                   Asn Gly Val Asp Thr Leu   (SEQ ID NO:24)
original sequence  AAT GGG GTC GAC ACG CTA   (SEQ ID NO:25)
new sequence               GTG GAT           (SEQ ID NO:26)
                           Val Asp

The 3-prime flanking sequence contained two polyadenylation signal sites and a
string of 16 adenine residues between the translational stop codon and the Xba I site. To
remove these sequences, a Sal I site was inserted just after the stop codon.

                   Phe Thr ***
original sequence  TTT ACT TGA GCA AGA CTG   (SEQ ID NO:27)
new sequence       TTT ACT TGA G GTCGA CTG   (SEQ ID NO:28)
                                 Sal I
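
Both sequence changes in this Example (the silent change at the internal Sal I site and the new Sal I site after the stop codon) can be checked with a short Python sketch. It assumes the sequences are as printed for SEQ ID NOS:25 to 28, that the codons flanking the changed region are unaltered, and uses only the subset of the standard genetic code needed here; all names are hypothetical.

```python
# Subset of the standard genetic code covering the codons printed in the text.
CODON = {
    "AAT": "Asn", "GGG": "Gly", "GTC": "Val", "GTG": "Val",
    "GAC": "Asp", "GAT": "Asp", "ACG": "Thr", "CTA": "Leu",
    "TTT": "Phe", "ACT": "Thr", "TGA": "***",
}
SAL_I = "GTCGAC"


def translate(seq):
    """Translate an in-frame DNA string codon by codon using the table above."""
    return [CODON[seq[i:i + 3]] for i in range(0, len(seq), 3)]


# Internal Sal I site: GTC GAC (Val Asp) is changed to GTG GAT (Val Asp).
INTERNAL_OLD = "AATGGGGTCGACACGCTA"
INTERNAL_NEW = "AATGGGGTGGATACGCTA"   # flanking codons assumed unchanged

# 3' flank: a Sal I site is created just after the TGA stop codon.
FLANK_OLD = "TTTACTTGAGCAAGACTG"
FLANK_NEW = "TTTACTTGAGGTCGACTG"

same_protein = translate(INTERNAL_OLD) == translate(INTERNAL_NEW)
internal_site_removed = SAL_I in INTERNAL_OLD and SAL_I not in INTERNAL_NEW
new_site_position = FLANK_NEW.find(SAL_I)
substitutions = [i for i, (a, b) in enumerate(zip(FLANK_OLD, FLANK_NEW)) if a != b]

print(translate(INTERNAL_NEW), same_protein, internal_site_removed)
print(translate(FLANK_NEW[:9]), new_site_position, substitutions)
```

Under these assumptions the internal change is silent (Val Asp is retained) while destroying the Sal I site, and three base substitutions downstream of the TGA codon are enough to create the new Sal I site one base after the stop codon.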

The Gapped Heteroduplex method was used to make the changes above. The
original plan was to gap the DNA between the Not I and Xba I sites and change the internal
Sal I site and add the 3-prime Sal I site at the same time. This proved difficult to
accomplish, so the 3-prime Sal I site was added first and a new gap was made between the
two Bgl II sites to change the internal Sal I site. The gapped regions were then sequenced in
their entirety to confirm that no other changes were made to the sequence. The only
difference found was in the fourth intron, 1673 bases from the initial ATG. A cytosine was
found in both the mutated and the original plasmid instead of adenine, as shown in the
printed sequence above. The Sal I fragment containing the entire coding region of the heavy
chain/β-glucuronidase fusion protein was then subcloned into the Xho I site of Bc163, a
mammary specific expression vector, and pcDNA3, a commercial mammalian expression vector.
Orientation was determined by restriction enzyme analysis and/or sequencing.
Figure 4A is a schematic diagram of the light chain construct (431A). The nucleotide and
amino acid sequences are shown in Figure 4B.



EXAMPLE 3: Generation of Linked Construct

This Example describes the generation of a construct which includes the light chain
and the heavy chain/β-glucuronidase fusion, along with their corresponding upstream and
downstream beta casein sequences, ligated together into a single cosmid.

In order to eliminate the possibility of integrating only one chain of a two chain
protein, such as an antibody, that has been co-injected into mice or other species, both
chains along with their own corresponding upstream and downstream beta casein sequences
were ligated together into a single cosmid.

To achieve this, supercos1 (Stratagene) was modified by inserting the
following oligonucleotides into the Bam HI site:

            Pvu I                    Pvu I
T3  GAT CAC CGATCG TCG ACC CCC TCG AGCGATCGA ... T7  (SEQ ID NO:29)
        TG GCT AGCAGCTGG GCG AGCTC G CTA GCT ACT AG  (SEQ ID NO:30)
                   Sal I       Xho I

These modifications create a new supercos plasmid, designated supercos 334, with
unique Sal I and Xho I sites. Pvu I, Not I, and Eco RI sites flank these sites and the Bam HI
site is destroyed.

The Sal I fragments from Bc174 or Bc175, containing the modified light chain and
heavy chain/β-glucuronidase coding regions within the beta casein 5-prime and 3-prime
flanking regions respectively, were inserted into the Xho I site of supercos 334. Three
clones were isolated and prepared. The orientation was determined by restriction enzyme
analysis.

clone #  name  insert  orientation
1        LC14  LC      reverse
2        LC13  LC      sense
11       HC9   HC      reverse

The complementary Sal I fragments from Bc174 and Bc175 (used above) were then
ligated into the Sal I site of the above constructions (heavy chain fragment into LC13 and
LC14, light chain fragment into HC9). The resulting ligations were then large enough to
package in vitro into lambda phage particles (Amersham kit N. 334) and were used to infect
E. coli XL1 Blue. Three versions were generated and one of each of these clones was
isolated and prepared:




clone #  name   insert  orientation
1        Bc180  HC/LC   reverse/reverse
9        Bc181  HC/LC   sense/sense
20       Bc182  LC/HC   reverse/reverse



Although made through two different pathways, Bc181 and Bc182 are essentially
the same insert when cut away from the vector. When viewed in the sense direction, they
both contain the heavy chain/β-glucuronidase Sal I cassette followed by and linked to the
light chain Sal I cassette. Each Sal I cassette contains the 5-prime beta casein promoter
region, the antibody coding region, and the 3-prime beta casein flanking sequence.

In essence, two species were made: the light chain cassette followed by the heavy
chain cassette, or the heavy chain cassette followed by the light chain cassette.
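
The statement that two of the three clones carry the same insert can be illustrated with a small Python sketch. Reading an insert from the opposite strand reverses the cassette order and inverts each orientation. The data structure and the function name `flip` are hypothetical; the cassette orders and orientations are taken from the clone table above.

```python
def flip(cassettes):
    """Read an insert from the opposite strand: order reverses, each orientation inverts."""
    invert = {"sense": "reverse", "reverse": "sense"}
    return [(name, invert[orientation]) for name, orientation in reversed(cassettes)]


# Cassette order and orientation as listed for the three linked clones.
BC180 = [("HC", "reverse"), ("LC", "reverse")]
BC181 = [("HC", "sense"), ("LC", "sense")]
BC182 = [("LC", "reverse"), ("HC", "reverse")]

print("Bc182 viewed in the sense direction:", flip(BC182))
print("Bc180 viewed in the sense direction:", flip(BC180))
```

Viewed in the sense direction, Bc182 reads heavy chain followed by light chain, the same as Bc181, while Bc180 reads light chain followed by heavy chain. This reproduces the two species described in the text.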

EXAMPLE 4: Characterization of the Light Chain and Heavy Chain/β-Glucuronidase
Constructs

The manipulated DNA fragments were tested in tissue culture using the pcDNA3
constructs described above, transfected into COS-7 cells using the standard protocol for
Lipofectamine with Opti-MEM (Gibco-BRL). Conditioned media (DMEM + 10% FBS)
was removed after 48 hours and run on 10-20% SDS-PAGE gels for Western blotting.

Western blots were conducted following standard procedures. Briefly, for the heavy
chain/beta-glucuronidase, samples were run in triplicate under reducing conditions and
electroblotted onto nitrocellulose. The nitrocellulose was then cut into three sections and
incubated overnight with each of three monoclonal antibodies: Mab 2149/80, Mab 2156/94,
and Mab 2156/215. The secondary antibody used for detection was from Cappel (cat. no.
55570), affinity purified horseradish peroxidase conjugated goat anti-mouse IgG.
Detection was with the ECL kit from Amersham. Mab 2149/80 was the only antibody that
showed a signal on the Western blot.

For the light chain, samples were again run under reducing conditions and
electroblotted onto nitrocellulose. The nitrocellulose was then incubated overnight with
horseradish peroxidase conjugated goat anti-human kappa chain antibody (Cappel no.
55233). Detection was with the ECL kit from Amersham.

EXAMPLE 5: Production of Transgenic Animals

Microinjection fragments were prepared by cutting the beta casein constructs Bc174
(light chain) and Bc175 (heavy chain) with Sal I to release the bacterial sequences.
Fragments were gel purified, then buffer exchanged and concentrated using the Wizard
system by Promega.

Microinjections of the original nucleotide sequences were tested in the mouse model
system using an expression vector containing the goat beta casein upstream and coding
sequences. Two separate constructions were made and co-injected into mouse embryos,
from which founder lines were identified and tested further. The original DNA sequences
were also co-injected with an "insulator" sequence which allows us to produce a higher
percentage of high producing animal lines. For example, without the insulator generally
one in three lines would be a relatively high producer. With the insulator, in many cases,
almost all of the lines produced are high expressing lines.

Two sets of injections were carried out as follows:

For the first set of injections, 1249 embryos were injected, of which 838 survived
and 737 were transferred to pseudopregnant females. From these females 80 live pups were
born, of which 8 were transgenic, 7 of which carried both chains.

For the second set of injections, 508 embryos were injected, of which 435 survived
and 426 were transferred to pseudopregnant females. From these females 44 live pups were
born, of which 2 were transgenic, both of which carried both chains.

Bc181 was injected over three days. In this set, 840 embryos were injected, of which
641 survived and 618 were transferred to pseudopregnant females. From these females 39
live pups were born, of which 5 were transgenic, 3 of which carried both chains. Due to the
repetition of the flanking beta casein sequences, it appears that in some cases recombination
occurs, deleting one chain or the other.

Bc181 was co-injected with the silencer fragment over four days. In this case, 1495 goat
embryos were injected, of which 1183 survived and 1073 were transferred to
pseudopregnant goat females. From these, 111 live pups were born and 10 of these were
transgenic, six carrying both chains. Two of the pups carry both the silencer fragment and
both antibody chains.
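
The counts reported for the four injection rounds can be turned into per-step efficiencies with a short Python sketch. The numbers are those stated in the text (reading the spaced digits in the scanned source as 1183, 1073 and 111); the round labels, the tuple layout and the helper `pct` are hypothetical.

```python
# (label, injected, survived, transferred, live pups, transgenic, both chains)
ROUNDS = [
    ("Bc174 + Bc175, set 1", 1249, 838, 737, 80, 8, 7),
    ("Bc174 + Bc175, set 2", 508, 435, 426, 44, 2, 2),
    ("Bc181", 840, 641, 618, 39, 5, 3),
    ("Bc181 + silencer", 1495, 1183, 1073, 111, 10, 6),
]


def pct(part, whole):
    """Percentage of whole represented by part, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


summary = {}
for label, injected, survived, transferred, pups, transgenic, both in ROUNDS:
    summary[label] = {
        "embryo survival %": pct(survived, injected),
        "transgenic % of pups": pct(transgenic, pups),
        "both chains % of transgenics": pct(both, transgenic),
    }

for label, stats in summary.items():
    print(label, stats)
```

On these figures, roughly 5 to 13% of live pups were transgenic in each round. The share of transgenic pups carrying both chains was 60% in both rounds that used the linked Bc181 construct, compared with 7 of 8 and 2 of 2 for the co-injected separate fragments, which is in line with the observation that recombination between the repeated beta casein sequences can delete one chain.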



EXAMPLE 5: Generation of Mutants of the Heavy Chain/β-Glucuronidase Fusion Protein

In an attempt to increase expression of active molecules, two mutations to the heavy
chain fusion protein were carried out. The first mutation was to remove the hinge region of
the construct. The second mutation removes the hinge and linker sequence
(ala-ala-ala-ala-val) (SEQ ID NO:31) at the beginning of the β-glucuronidase coding
sequence, fusing the CH2 portion to β-glucuronidase.

To achieve this, gapped heteroduplex mutagenesis was again used. The construct
Behring HC5 (which contains the fusion protein in pGem3Z with both ends modified and an
internal Sal I site removed) was linearized with Xba I. A second aliquot was cut with
BstE2 plus Not I. When the two are boiled together and cooled, some of each strand anneals,
forming the heteroduplex containing a single stranded gap, in this case between the BstE2
and Not I sites. Two new constructs were then made, sequencing over the gapped portion to
make sure no other mutations were made inadvertently.

GTC #403: using the oligonucleotide "Behr hinge-alternate" (in bold below) removes the
hinge region and part of the introns immediately preceding and after it.

ccaaactctctactcACTCAGCTCA CGCATCCACQccatcccagatccccgt (SEQ ID NO:32)
               intron     intron

GTC #406: using the oligonucleotide "Behr hinge/linker" (in bold below) removes the
hinge region and ala-ala-ala-ala-val linker, fusing the CH2 and β-glucuronidase coding
regions.

agcaacaccaaggtgGACAAGAGAGTT CAGGGCGGGATGctgtacccccaggag (SEQ ID NO:33)
               CH2 coding sequence  β-glucuronidase



The mutated fusion protein coding sequence can then be excised using Sal I and
subcloned into an appropriate expression vector.

High levels of expression of the encoded proteins were obtained with a vector
consisting of a silencer (or insulator) fragment followed by the goat beta casein promoter,
insert DNA, and goat beta casein 3-prime untranslated regions. Both mutant heavy chains
and the light chain have been subcloned into such a vector, Bc450, which is flanked by Sal I
sites which release the entire injection fragment.



Bc454: Bc450 with heavy chain mutant 403 (minus hinge)
Bc456: Bc450 with heavy chain mutant 406 (minus hinge/linker)
Bc458: Bc450 with the light chain

Figure 5 depicts the nucleotide and amino acid sequence for the mutant heavy chain
of humanized anti-carcinoembryonic antigen antibody 431 lacking the hinge region.
Figure 6 is a schematic diagram of a construct (Bc454) containing the mutant heavy
chain of humanized anti-carcinoembryonic antigen antibody 431 linked to the
β-glucuronidase sequence. Also indicated is the location of the silencer, the 5' β-casein
untranslated region, the heavy chain mutant/β-glucuronidase fusion coding region, and the
3' β-casein untranslated region.

EXAMPLE 6: Characterization of Transgenic Animals

The previous examples describe the testing of the original fusion protein and two
heavy chain mutants in the milk expression system. The original fusion proteins were tested
both without the insulator and also co-injected with a separate insulator fragment. The
heavy chain mutants, on the other hand, were tested with the insulator integrated into the
construct.

Initially, the concentration of the fusion protein produced in milk was estimated by
comparing the signal of a sample to that of a standard on a Western blot. Later experiments
measured activity rather than estimating concentration from Western blots. The activity
measurements were more accurate.

Except for the first set of constructs, Bc174 + Bc175, estimates of protein
concentration by Western blot are rough estimates. Generally, lines that express well
appear to be in the 1-2 mg/ml range.

Expression data is summarized below, with more detailed data sets for each
construct following.



Constructs         DNA                 Insulator             Western (HC) µg/ml  Maximum activity
                                                             (estimated)         µg/ml
Bc174/Bc175        original            no                    ~800                20
Bc181              original, linked    no                    1000-2000           n.a.
Bc181 + insulator  original, linked    co-injected fragment  ~1000               100
Bc456/Bc458        minus hinge/linker  yes                   1000-2000           8
Bc454/Bc458        minus hinge         yes                   1000-2000           800



Essentially, the results shown herein indicate that while high levels of protein can be
made in milk, most of this protein is not active. Such inactivity may be due to a folding
problem or a problem in the assembly of the tetramer. Removal of the hinge and linker also
produced a protein with low activity. In contrast, a substantial amount of enzymatic activity
was achieved upon the removal of the hinge alone.
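
The conclusion drawn from the summary table can be made concrete by dividing the maximum activity by the Western blot estimate for each construct. The sketch below makes two assumptions that the text does not state: that the activity figure can be read as µg/ml of active protein directly comparable with the Western estimate, and that the approximate Western values can be treated as exact. The Bc181 row without insulator is omitted because no activity value is given for it in the summary. All names are hypothetical.

```python
# (construct, Western estimate range in ug/ml as (low, high), maximum activity in ug/ml)
SUMMARY = [
    ("Bc174/Bc175", (800, 800), 20),
    ("Bc181 + insulator", (1000, 1000), 100),
    ("Bc456/Bc458", (1000, 2000), 8),
    ("Bc454/Bc458", (1000, 2000), 800),
]


def active_fraction_range(western, activity):
    """Return (min %, max %) of the Western-estimated protein accounted for by activity."""
    low, high = western
    return (round(100.0 * activity / high, 1), round(100.0 * activity / low, 1))


fractions = {name: active_fraction_range(western, activity)
             for name, western, activity in SUMMARY}

for name, (lo_pct, hi_pct) in fractions.items():
    print(f"{name}: {lo_pct}% to {hi_pct}% apparently active")
```

Read this way, the original constructs and the hinge/linker deletion come out at roughly 10% or less apparently active protein, whereas the hinge-only deletion (Bc454/Bc458) comes out at roughly 40 to 80%. These percentages are only as good as the rough Western estimates on which they rest.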

Approximately 8 mg of this protein have been produced in mouse milk. The
isolated protein is currently being tested in in vivo studies ("human CEA positive colon
cancer metastasis model").

A summary of the data regarding the mice produced and analysis done follows, in
table form.



A. Bc174/Bc175 founders
Original DNA without the insulator

Line  2nd Gen.  Sex  PCR LC  PCR HC  Copy # LC  Copy # HC  Western µg/ml  Activity
2               F    +       +       10         10                        0.14
4               F    +       +       1          1                         <0.1
10              F    +       +       10         10         ~800           18
      142       F    +       +                             [illegible]
22              F    +       +       10         10
      154       F    +       +                             [illegible]
23              F    +       +       50         50         [illegible]    [illegible]
      200       F    +       +                             [illegible]
40              M    +       +       100        100
62              F    +       +       +          +                         <0.1
81              M    +                                     n.a.
85              F    +       +       25         25                        0.3
116             M    +       +       5          5
      216       F    +                                     ~800
      221       F    +       +                             ~800

n.a. = not analyzed (line carries only one chain)



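B. Bc181 founders

Original DNA without the insulator; a fusion of the Bc174 and Bc175 injection fragments

Line | F1 | Sex | PCR LC | PCR HC | Copy # LC | Copy # HC | Western HC ug/ml (approx.) | Activity ug/ml*
-----+----+-----+--------+--------+-----------+-----------+----------------------------+----------------
6    |    | M   | +      | +      | 12        | 12        |                            |
     | 49 | F   |        |        |           |           | 0.0                        |
     | 50 | F   |        |        |           |           | 0.0                        |
     | 52 | F   |        |        |           |           | >1000                      | 39
     | 60 | F   |        |        |           |           | >1000                      | 41
25   |    | F   | +      | +      | 15        | 15        | 0.0                        |
29   |    | F   |        | +      | n.a.      | n.a.      | 0.0                        |
33   |    | M   |        | +      | n.a.      | n.a.      | n.a.                       |
36   |    | F   | +      | +      | 100       | 100       | 0.0                        |

n.a. = not analyzed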
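C. Bc181 + insulator founders

Original DNA (fusion of the Bc174 and Bc175 injection fragments) co-injected with the
insulator

Line | 2nd gen. | Sex | PCR LC | PCR HC | PCR Sil | Copy # LC | Copy # HC | Copy # Sil | Western HC ug/ml (approx.) | Activity ug/ml*
-----+----------+-----+--------+--------+---------+-----------+-----------+------------+----------------------------+----------------
9    |          | M   | -      | +      | n.a.    | 0         | 1         | -          | n.a.                       |
13   |          | M   | +      | -      | n.a.    | 2         | 0         | -          | n.a.                       |
15   |          | F   | +      | -      | n.a.    | 3         | 0         | -          | n.a.                       |
33   |          | F   |        | +      | n.a.    | 3         | 3         | -          | ~800                       |
40   |          | M   |        | +      | n.a.    | 0         | 2         |            | n.a.                       |
58   |          | M   | +      | +      | n.a.    | 2         | 10        |            |                            |
     | 2-139    | F   | +      | +      |         |           |           |            | ~800                       | 43
     | 2-140    | F   | +      | +      |         |           |           |            | ~800                       | 31
66   |          | F   | +      | +      | n.a.    | 1         | 1         | +          |                            |
78   |          | F   | +      | +      | n.a.    | 1         | 1         |            | 0.0                        |
81   |          | M   | +      | +      | n.a.    | 1         | 1         |            | Not pass                   |
90   |          | M   | +      | +      | n.a.    | 20        | 20        | +          |                            |
     | 2-123    | F   | +      | +      | +       |           |           |            | ~1000                      | ~100
     | 2-124    | F   | +      | +      | +       |           |           |            | ~1000                      | 60
     | 2-126    | F   | +      |        |         |           |           |            | Low                        | 15

n.a. = not analyzed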
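
By way of illustration only, the proportion of fusion protein that is enzymatically active can be estimated by dividing the activity value of a line by its Western value. The following Python sketch is not part of the described methods. It assumes that the two columns are expressed in comparable ug/ml units, and it reads the approximate Western values of lines 2-139, 2-140 and 2-124 as 800, 800 and 1000 ug/ml. The function name `active_fraction` is hypothetical.

```python
def active_fraction(activity_ug_ml: float, western_ug_ml: float) -> float:
    """Fraction of the Western-estimated protein that scores as active."""
    if western_ug_ml <= 0:
        raise ValueError("Western estimate must be positive")
    return activity_ug_ml / western_ug_ml


# (Western ug/ml, activity ug/ml) pairs read from the table above;
# the Western figures are approximate values, treated here as exact.
rows = {"2-139": (800, 43), "2-140": (800, 31), "2-124": (1000, 60)}

fractions = {
    line: active_fraction(activity, western)
    for line, (western, activity) in rows.items()
}

for line, fraction in fractions.items():
    print(f"line {line}: {fraction:.1%} of the protein is active")
```

Under these assumptions each of the three lines comes out well below one tenth active, which is in keeping with the observation that most of the protein made from the original constructs is inactive.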
D. Bc456 + Bc458 founders
Mutation removing the hinge and linker:

1st Generation | 2nd Generation | Sex | Western HC | Activity ug/ml
---------------+----------------+-----+------------+---------------
6              |                | F   | none       | 0
8              |                | F   |            | [illegible]
13             |                | F   | none       | [illegible]
               |                | F   |            | 0
24             |                | F   | none       | 0
57             |                | F   | good       | 2
65             |                | F   | good       | 8
66             |                | F   | low        | 0
138            |                | M   |            |
               | 175            | F   |            |
152            |                | F   | none       | 0
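E. Bc454 + Bc458

Mutation removing the hinge only

1st Generation | 2nd Generation | 3rd Generation | Sex | Western HC       | Activity ug/ml
---------------+----------------+----------------+-----+------------------+---------------
162 died       | c-section      |                | F   | -                |
               | 4              |                | F   | High             | 66
180            |                |                | F   | Good             | 177
               | 11             |                | F   |                  | 133
               | 12             |                | F   |                  | 185
182            |                |                | M   | -                |
               | 15             |                | F   | Good             | 83
187            |                |                | M   | Not passing gene |
193            |                |                | M   | -                |
               | 27             |                | F   | High             | 838
               |                | 57             | F   |                  | 829
               |                | 58             | F   |                  | 742
               |                | 59             | F   |                  | 944
               |                | 60             | F   |                  | 574
               |                | 61             | F   |                  | 752
               |                | 62             | F   |                  | 534
201            |                |                | F   | Good             | 416
215            |                |                | F   |                  | (died)
219            |                |                | M   |                  |
               |                |                | F   |                  |
220            |                |                | M   |                  |
               |                |                | F   |                  |
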
Example 7: Generation and Characterization of Transgenic Goats

The sections outlined below briefly describe the major steps in the production of
transgenic goats.

Goat species and breeds:

Swiss-origin goats, e.g., the Alpine, Saanen, and Toggenburg breeds, are preferred 
in the production of transgenic goats. 

Goat superovulation:

The timing of estrus in the donors is synchronized on Day 0 by 6 mg subcutaneous
norgestomet ear implants (Syncromate-B, CEVA Laboratories, Inc., Overland Park, KS).
Prostaglandin is administered after the first seven to nine days to shut down the endogenous
synthesis of progesterone. Starting on Day 13 after insertion of the implant, a total of 18 mg
of follicle-stimulating hormone (FSH - Schering Corp., Kenilworth, NJ) is given
intramuscularly over three days in twice-daily injections. The implant is removed on Day
14. Twenty-four hours following implant removal, the donor animals are mated several
times to fertile males over a two-day period (Selgrath, et al., Theriogenology, 1990. pp.
1195-1205).
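
The FSH timetable described above can be laid out programmatically. The following is a minimal Python sketch and is not part of the cited protocol. It assumes that the 18 mg total is divided equally among the six injections (the text states only the total), and the names `Injection` and `fsh_schedule` and the AM/PM labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Injection:
    day: int         # days after implant insertion (Day 0)
    slot: str        # hypothetical label for each of the two daily injections
    dose_mg: float


def fsh_schedule(total_mg: float = 18.0, start_day: int = 13, n_days: int = 3) -> list[Injection]:
    """Twice-daily intramuscular FSH injections over n_days, starting on start_day."""
    slots = ("AM", "PM")
    # Equal division of the total dose is an assumption, not stated in the text.
    dose = total_mg / (n_days * len(slots))
    return [Injection(start_day + d, slot, dose) for d in range(n_days) for slot in slots]


schedule = fsh_schedule()
for inj in schedule:
    print(f"Day {inj.day} {inj.slot}: {inj.dose_mg:.1f} mg FSH")
```

With the default arguments the sketch yields six injections on Days 13 to 15, so the implant removal on Day 14 falls within the injection period.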


Embryo collection: 

Surgery for embryo collection occurs on the second day following breeding (or 72
hours following implant removal). Superovulated does are removed from food and water
36 hours prior to surgery. Does are administered 0.8 mg/kg Diazepam (Valium®) IV,
followed immediately by 5.0 mg/kg Ketamine (Ketaset), IV. Halothane (2.5%) is
administered during surgery in 2 L/min oxygen via an endotracheal tube. The reproductive
tract is exteriorized through a midline laparotomy incision. Corpora lutea, unruptured
follicles greater than 6 mm in diameter, and ovarian cysts are counted to evaluate
superovulation results and to predict the number of embryos that should be collected by
oviductal flushing. A cannula is placed in the ostium of the oviduct and held in place with a
single temporary ligature of 3.0 Prolene. A 20 gauge needle is placed in the uterus
approximately 0.5 cm from the uterotubal junction. Ten to twenty ml of sterile phosphate
buffered saline (PBS) is flushed through the cannulated oviduct and collected in a Petri dish.
This procedure is repeated on the opposite side and then the reproductive tract is replaced in
the abdomen. Before closure, 10-20 ml of a sterile saline glycerol solution is poured into
the abdominal cavity to prevent adhesions. The linea alba is closed with simple interrupted
sutures of 2.0 Polydioxanone or Supramid and the skin closed with sterile wound clips.

Fertilized goat eggs are collected from the PBS oviductal flushings under a
stereomicroscope, and are then washed in Ham's F12 medium (Sigma, St. Louis, MO)
containing 10% fetal bovine serum (FBS) purchased from Sigma. In cases where the
pronuclei are visible, the embryos are immediately microinjected. If pronuclei are not
visible, the embryos can be placed in Ham's F12 containing 10% FBS for short term culture
at 37°C in a humidified gas chamber containing 5% CO2 in air until the pronuclei become
visible (Selgrath, et al., Theriogenology, 1990. pp. 1195-1205).

Microinjection procedure: 

One-cell goat embryos are placed in a microdrop of medium under oil on a glass
depression slide. Fertilized eggs having two visible pronuclei are immobilized on a
flame-polished holding micropipet on a Zeiss upright microscope with a fixed stage using
Nomarski optics. A pronucleus is microinjected with the DNA construct of interest, e.g., a
BC355 vector containing the fusion protein gene operably linked to the regulatory elements
of the goat beta-casein gene, in injection buffer (Tris-EDTA) using a fine glass microneedle
(Selgrath, et al., Theriogenology, 1990. pp. 1195-1205).

Embryo development:

After microinjection, the surviving embryos are placed in a culture of Ham's F12
containing 10% FBS and then incubated in a humidified gas chamber containing 5% CO2 in
air at 37°C until the recipient animals are prepared for embryo transfer (Selgrath, et al.,
Theriogenology, 1990. pp. 1195-1205).




Preparation of recipients: 

Estrus synchronization in recipient animals is induced by 6 mg norgestomet ear
implants (Syncromate-B). On Day 13 after insertion of the implant, the animals are given a
single non-superovulatory injection (400 I.U.) of pregnant mare's serum gonadotropin
(PMSG) obtained from Sigma. Recipient females are mated to vasectomized males to
ensure estrus synchrony (Selgrath, et al., Theriogenology, 1990. pp. 1195-1205).

Embryo Transfer: 

All embryos from one donor female are kept together and transferred to a single
recipient when possible. The surgical procedure is identical to that outlined above for
embryo collection, except that the oviduct is not cannulated, and the embryos are
transferred in a minimal volume of Ham's F12 containing 10% FBS into the oviductal
lumen via the fimbria using a glass micropipet. Animals having more than six to eight
ovulation points on the ovary are deemed unsuitable as recipients. Incision closure and
post-operative care are the same as for donor animals (see, e.g., Selgrath, et al.,
Theriogenology, 1990. pp. 1195-1205).

Monitoring of pregnancy and parturition: 

Pregnancy is determined by ultrasonography 45 days after the first day of standing
estrus. At Day 110, a second ultrasound exam is conducted to confirm pregnancy and assess
fetal stress. At Day 130 the pregnant recipient doe is vaccinated with tetanus toxoid and
Clostridium C&D. Selenium and vitamin E (Bo-Se) are given IM and Ivermectin is given
SC. The does are moved to a clean stall on Day 145 and allowed to acclimatize to this
environment prior to inducing labor on about Day 147. Parturition is induced at Day 147
with 40 mg of PGF2α (Lutalyse®, Upjohn Company, Kalamazoo, Michigan). This injection
is given IM in two doses, one 20 mg dose followed by a 20 mg dose four hours later. The
doe is under periodic observation during the day and evening following the first injection of
Lutalyse® on Day 147. Observations are increased to every 30 minutes beginning on the
morning of the second day. Parturition occurred between 30 and 40 hours after the first
injection. Following delivery the doe is milked to collect the colostrum and passage of the
placenta is confirmed.
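
The induction timings stated above (two 20 mg doses four hours apart, with parturition observed 30 to 40 hours after the first dose) can be turned into clock times once the time of the first injection is known. The following Python sketch is illustrative only. The function name `induction_plan`, the dictionary keys and the example date and time are hypothetical.

```python
from datetime import datetime, timedelta


def induction_plan(first_injection: datetime) -> dict[str, datetime]:
    """Clock times derived from the time of the first 20 mg PGF2-alpha dose."""
    return {
        "dose_1_20mg": first_injection,
        "dose_2_20mg": first_injection + timedelta(hours=4),
        # Window in which parturition was observed, per the text above.
        "window_start": first_injection + timedelta(hours=30),
        "window_end": first_injection + timedelta(hours=40),
    }


# Hypothetical example: first dose given at 08:00 on Day 147.
plan = induction_plan(datetime(2000, 1, 1, 8, 0))
for label, when in plan.items():
    print(f"{label}: {when:%Y-%m-%d %H:%M}")
```

For a first dose at 08:00, the window runs from 14:00 on the second day to midnight, which falls after the point at which observations are increased to every 30 minutes.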

Verification of the transgenic nature of F0 animals:

To screen for transgenic F0 animals, genomic DNA is isolated from two different
cell lines to avoid missing any mosaic transgenics. A mosaic animal is defined as any goat
that does not have at least one copy of the transgene in every cell. Therefore, an ear tissue
sample (mesoderm) and blood sample are taken from a two day old F0 animal for the
isolation of genomic DNA (Lacy, et al., A Laboratory Manual, 1986, Cold Spring Harbor,
NY; and Herrmann and Frischauf, Methods Enzymology, 1987. 152: pp. 180-183). The
DNA samples are analyzed by the polymerase chain reaction (Gould, et al., Proc. Natl.
Acad. Sci., 1989. 86: pp. 1934-1938) using primers specific for the fusion protein gene and
by Southern blot analysis (Thomas, Proc. Natl. Acad. Sci., 1980. 77: 5201-5205) using a
random primed first member or second member cDNA probe (Feinberg and Vogelstein,
Anal. Bioc., 1983. 132: pp. 6-13). Assay sensitivity is estimated to be the detection of one
copy of the transgene in 10% of the somatic cells.
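
The reason for sampling two tissues can be illustrated with the stated sensitivity of one copy in 10% of cells. The following Python sketch is not part of the described assay. It assumes that the signal scales linearly with the average number of transgene copies per cell, and the names `DETECTION_LIMIT`, `detectable` and `call_founder` are hypothetical.

```python
# One copy in 10% of cells corresponds to 0.1 copy per cell on average.
DETECTION_LIMIT = 0.10


def detectable(fraction_of_cells: float, copies_per_cell: int = 1) -> bool:
    """True if the average copy number per cell reaches the assumed limit."""
    return fraction_of_cells * copies_per_cell >= DETECTION_LIMIT


def call_founder(ear_fraction: float, blood_fraction: float, copies_per_cell: int = 1) -> str:
    """Combine the results from the ear tissue and blood samples."""
    ear = detectable(ear_fraction, copies_per_cell)
    blood = detectable(blood_fraction, copies_per_cell)
    if ear and blood:
        return "positive in both tissues"
    if ear or blood:
        return "positive in one tissue (possible mosaic)"
    return "not detected"


print(call_founder(1.0, 1.0))
print(call_founder(0.5, 0.02))
print(call_founder(0.01, 0.02))
```

In the second call the transgene would be missed if only blood were tested, which is the situation the two-tissue screen is intended to avoid.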

Generation and Selection of production herd 

The procedures described above can be used for production of transgenic founder
(F0) goats, as well as other transgenic goats. The transgenic F0 founder goats, for example,
are bred to produce milk, if female, or to produce transgenic female offspring, if male.
A transgenic founder male can be bred to non-transgenic females to
produce transgenic female offspring.

Transmission of transgene and pertinent characteristics

Transmission of the transgene of interest in the goat line is analyzed in ear tissue
and blood by PCR and Southern blot analysis. For example, Southern blot analysis of the
founder male and the three transgenic offspring shows no rearrangement or change in the
copy number between generations. The Southern blots are probed with an
immunoglobulin-enzyme fusion protein cDNA probe. The blots are analyzed on a Betascope 603 and copy
number is determined by comparison of the transgene to the goat beta casein endogenous
gene.
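
The comparison to the endogenous gene can be expressed as a simple ratio. The following Python sketch is illustrative only and is not the Betascope's own calculation. It assumes that the endogenous beta casein gene is present at two copies per diploid genome, that both signals are background-subtracted, and that the probe detects both bands with equal efficiency. The function name and the counts are hypothetical.

```python
def transgene_copy_number(
    transgene_signal: float,
    endogenous_signal: float,
    endogenous_copies: int = 2,
) -> float:
    """Copies per diploid genome, scaled from the endogenous reference band."""
    if endogenous_signal <= 0:
        raise ValueError("endogenous signal must be positive")
    return endogenous_copies * transgene_signal / endogenous_signal


# Hypothetical background-subtracted counts for a founder and one offspring.
founder = transgene_copy_number(5200.0, 1040.0)
offspring = transgene_copy_number(4900.0, 1000.0)

print(f"founder: about {founder:.0f} copies")
print(f"offspring: about {offspring:.0f} copies")
print("copy number unchanged:", round(founder) == round(offspring))
```

Rounding to the nearest whole copy before comparing generations is a design choice of the sketch; a real analysis would need to account for measurement error in both bands.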

Evaluation of expression levels 

The expression level of the transgenic protein in the milk of transgenic animals is
determined using enzymatic assays or Western blots.



Other embodiments are within the following claims.

What is claimed is:

1. A method of making a fusion protein having: a first member, fused
to a second member wherein the first and second members are chosen such that the
fusion protein assembles into a complex having a number of subunits which
optimizes activity of the multimeric form of the second member.

2. The method of claim 1, wherein the first member, or the fusion protein,
assembles into a form having the same number of subunits as are present in an active
form of the second member.

3. The method of claim 1, wherein the first member includes an Ig subunit.

4. The method of claim 1, wherein the second member is other than an Ig
subunit.

5. The method of claim 1, wherein the first member has been modified at a
site which modulates formation or maintenance of a multimer of subunits.

6. The method of claim 1, wherein the first member forms a dimer.

7. The method of claim 1, wherein the first member includes an Ig subunit,
which has been modified to inhibit formation of a multimeric form.

8. The method of claim 7, wherein the modification is a change, insertion, or
deletion of one or more amino acid residues, and results in a subunit which does not form a
multimer or which forms a lower order multimer than it normally would form.

9. The method of claim 7, wherein hinge region of the immunoglobulin is

10. The method of claim 7, wherein the modification results in a dimeric Ig
structure.

11. The method of claim 10, wherein the dimer includes a heavy chain fusion
and a light chain fusion.

12. The method of claim 1, wherein the second member includes
beta-glucuronidase.

13. The method of claim 1, wherein the first member is an immunoglobulin (Ig)
heavy or light chain, and the second member is human beta-glucuronidase fusion protein.

14. The method of claim 1, wherein the fusion protein is produced in a
transgenic animal.

15. A method for providing a transgenically produced fusion protein of claim 1,
comprising obtaining milk from a transgenic mammal, which includes a fusion protein
encoding transgene that results in the expression of the protein-coding sequence of fusion
protein in mammary gland epithelial cells, thereby secreting the fusion protein in the milk of
the mammal.

16. A nucleic acid construct, which includes:
(a) optionally, an insulator sequence;
(b) a promoter, e.g., a mammary epithelial specific promoter, e.g., a milk protein
promoter;
(c) a nucleotide sequence which encodes a signal sequence which can direct the
secretion of the fusion protein, e.g., a signal sequence from a milk specific protein, or an
immunoglobulin;
(d) optionally, a nucleotide sequence which encodes a sufficient portion of the
amino terminal coding region of a secreted protein, e.g., a protein secreted into milk, or an
immunoglobulin, to allow secretion, e.g., in the milk of a transgenic mammal, of the fusion
protein;
(e) one or more nucleotide sequences which encode a fusion protein, e.g., a fusion
protein as described herein; and
(f) optionally, a 3' untranslated region from a mammary epithelial specific gene,
e.g., a milk protein gene.

17. A nucleic acid construct, which includes a nucleic acid molecule encoding a
fusion protein of claim 1.

18. A fusion protein described in claim 1.

19. A transgenic animal which includes a transgene that encodes a fusion protein
of claim 1.
genomic construct of 431 LC in
pAB Stop

NcoI

HindIII

   1 AAGCTTATGA ATATGCAAAT CCTGCTCATG AATATGCAAA TCCTCTGAAT CTACATGGTA AATATAGGTT

  71 TGTCTATACC ACAAACAGAA AAACATGAGA TCACAGTTCT CTCTACAGTT ACTGAGCACA CAGGACCTCA

Signal

NcoI

 141 CC ATG GGA TGG AGC TGT ATC ATC CTC TTC TTG GTA GCA ACA GCT ACA GGTAAGGGGC
  1>    Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr

 198 TCACAGTAGC AGGCTTGAGG TCTGGACATA TATATGGGTG ACAATGACAT CCACTTTGCC TTTCTCTCCA

 268 CA GAC ATC CAG ATG ACC CAG AGC CCA AGC AGC CTG AGC GCC AGC GTG GGT GAC AGA
  1>    Asp Ile Gln Met Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg

Figure 1B

Asp718I

 324 GTG ACC ATC ACC TGT AGT ACC AGC TCG AGT GTA AGT TAC ATG CAC TGG TAC CAG CAG
 19> Val Thr Ile Thr Cys Ser Thr Ser Ser Ser Val Ser Tyr Met His Trp Tyr Gln Gln

 381 AAG CCA GGT AAG GCT CCA AAG CTG CTG ATC TAC AGC ACA TCC AAC CTG GCT TCT GGT
 38> Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr Ser Thr Ser Asn Leu Ala Ser Gly

Asp718I

 438 GTG CCA AGC AGA TTC AGC GGT AGC GGT AGC GGT ACC GAC TTC ACC TTC ACC ATC AGC
 57> Val Pro Ser Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Phe Thr Ile Ser

 495 AGC CTC CAG CCA GAG GAC ATC GCC ACC TAC TAC TGC CAT CAG TGG AGT AGT TAT CCC
 76> Ser Leu Gln Pro Glu Asp Ile Ala Thr Tyr Tyr Cys His Gln Trp Ser Ser Tyr Pro

 552 ACG TTC GGC CAA GGG ACC AAG GTG GGTGAGTCCT TACAACCTCT CTCTTAGTCT CCTCAGGTGA
 95> Thr Phe Gly Gln Gly Thr Lys Val


616 GTCCTTACAA CCTCTCTCTT CTATTCAGCT TAAATAGATT TTACTGCATT TGTTGGGGGG GAAATGTGTG 



686 TATCTGAATT TCAGGTCATG AAGGACTAGG GACACCTTGG GAGTCAGAAA GGGTCATTGG GAGCCGTGGC
756 TGATGCAGAC AGACATCCTC ACCTCCCAGA CCTCATCGCC AGAGMTTAT ACCATCCTTC TAAACTCTGA
826 GGGGCTCGGA TGACGTGGCC ATTCTTTCOC TAAAGCATTG AGTTTACTC5C AAGGTCAGAA AAGCATGCAA

Figure 1B (continued)

896 agccctcaga atcgctgcaa agagctccaa caaaacaatt tagaacttta ttaaggaata gggggaacct

966 AGGAAGAAAC TCAAAACATC AAGATTTTAA ATACGCTTCT TGGTCTCCTT GCTATAATTA TCTGGGATAA 

1036 GCATGCTGTT TrcrGTCTbT CCCTAACATG CCCTGTGATT ATCCGCAAAC AACACACCCA AGGGCAGAAC 

1106 TTTGTTACTT AAACACCATC CTGTTTGCTT CTTTCCTCA GGA ACT GTC GCT GCA CCA TCT GTC
  1>                                            Gly Thr Val Ala Ala Pro Ser Val

1169 TTC ATC TTC CCG CCA TCT GAT GAG CAG TTG AAA TCT GGA ACT GCC TCT GTT GTG TGC
  9> Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly Thr Ala Ser Val Val Cys

1226 CTG CTG AAT AAC TTC TAT CCC AGA GAG GCC AAA GTA CAG TGG AAG GTC GAT AAC GCC
 28> Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala

1283 CTC CAA TCC GGT AAC TCC CAG GAG AGT GTC ACA GAG CAG GAC AGC AAG GAC AGC ACC
 47> Leu Gln Ser Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr

1340 TAC AGC CTC AGC AGC ACC CTG ACG CTG AGC AAA GCA GAC TAC GAG AAA CAC AAA GTC
 66> Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys His Lys Val

CK

1397 TAC GCC TGC GAA GTC [remainder of row not legible]
 85> Tyr Ala Cys Glu Val

1454 AGG GGA GAG TGT TAG
104> Arg Gly Glu Cys





Sad 



1519 TCCTTTGGCC TCTGACCCTT TTTCCACAGG NNMI £LL3£HL£ NNNN GGACCTACCC CTAXTGCGGT 
1586 CCTCCAGCTC ATCTTTCACC TCACCCCCCT C C I CC T CC 1 I GGCTTTAATT ATGCTAATGT TGGAGGAGAA 
1656 TGAATAAATA AAGTGAATCT TTGCACCTGT GGITIUCIC TTTCCTCAAT TTAATAATTA TTATCTGTTG 
1726 TTTACCAACT ACTCAATTTC TCTTATAAGG GACTAAATAT GTAGTCATOC TAAGGCCCAT AACCATTTAT 
1796 AAAAATCATC Cl'It A TTCXA TTTTACCCTA TCATCCTCTG CAAGACAGTC CTCCCTCAAA CCCACAAGCC 
1866 TTCTGTCCTC ACAGTCCCCT GGGCCGTGGT AGCAGAGACT TGCTTCCTTG VmUJU.lt: CTCAGCAAGC 
1936 CCTATAGTCC TTTTTAAGGG TGACAGGTCT TACGGTCATA TATCCTTTGA TTCAATTCCC TGGGAATCAA 

SacI

Asp718I EcoRI PstI SmaI

2006 CCAAGGCAAA TTTTTCAAAA GAAGAAACCT GCGGGTACCG AGCTCGAATT CCTGCAGCCC GGGGGATCCA

2076 TCC 
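
The residue annotations printed beneath the coding portions of the sequence above can be checked mechanically against the standard genetic code. The following Python sketch is illustrative only. It translates the fifteen signal-sequence codons shown at position 141 of the figure, and the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into three-letter residue codes."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return " ".join(THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3))


signal = "ATG GGA TGG AGC TGT ATC ATC CTC TTC TTG GTA GCA ACA GCT ACA"
print(translate(signal))
# prints: Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr
```

A codon whose translation disagrees with the printed residue indicates a likely transcription error in either the nucleotide line or the residue line.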
Sal I insert containing 431 light
chain coding sequences



11 C7TCGACCTCGAGCCA CC ATG GOA TSG AGC TGT ATC ATC CTC TTC TTC GTA 
61 GCA ACA GCT ACA G3TAAGGGGC TCACAGTAGC AGCCTTOAGG TCTGGACATA 
113 TA.TATGGGTG ACAATGACAT COUCTTTGCC TTTCTCTCCA CA GGTOTCCACTa: GAC 
170 ATC CAG ATC ACC CAG AGC CCA AGC AGC CTC AGC GCC AOC GTG GOT GAC 
218 AQA GTC ACC ATC ACC TCT AGT ACC AGC TCG AGT CTA AGT TAC ATC CAC 
266 TOG TAC CAG CAG AAG CCA GGT AAG OCT CCA AAG CTO CTC ATC TAC! AOC 
314 ACA TCC AAC CTG GCT TCT GGT C7IG CCA AGC AGA TTC AGC GGT AGC QGT 
362 AGC GOT ACC GAC TTC ACC TTC ACC ATC AGC AOC CTC CAG CCA GA£i GAC 
410 ATC GCC ACC TAC TAC TGC CAT CAG TOG AGT AGT TAT CCC ACG TTC GGC 
458 CAA GGG ACC AAG GTC G AAATO^AACGTGAGTAGAATTTAAACl'l'lV^'l^^lllAGTTC 
516 OATCCCC^TICTAAACTCTSA^^ 
580 OTACTC CAAGGTICAGAAAACOt^^ 
644 AATTTAGAACTITATTAAGGAATAGGGGQAAGC^ 

7Q8 srftCOCI tt'i\Miuiw^^ 

772 AACATGCCCTGTGATTATCCGCAAACAACACACCCAAGGCC^ 
836 TCCTQTTTGCTTCTTTCCTCAGGAACTGTCGCTO^ 
900 ATGAGGAGTTOAAATCTGCtAACTGCCTCTGTTGT^ 

964 GGCGAAAGTACACTC<5AAGGTGGATAACGC CCTCCAATCX3GGTAACTCCCAGGAGAGT 2TCACA 
1028 cy^cAGGACASOUWGGAC^ 
1092 ACGAGAAACACAAACrrCTAOXXT^^ 

Sail ( 1183) 

1156 GAGCTTCAACAGQC<^GAaTGTTAGAG<5TCGAC 



Figure 2

Figure 3

genomic construct of 431 HC link hum-β-Gluc
in pAB Stop

Figure 4A

i LfY-TTATCAA TATGCAAATC CTGCTCATGA ATATGCAAAT CCTCTGAATC TACATGGTAA ATATAGGTTT

1 Ncol (14C 

71 gtctatacca caaacagaaa aacatcagat cacagttctc tctacactta ctcagcacac aggacctcac c atg 

US GGA TGG AGC T3T ATC ATC CTC TTC TIG GTA GCA ACA GCT AC A GGTAAGCGGC TCACAGTAGC 

?»Giv Tro Ser Cys lie I I a Ltu Phe Lau Val Ala TM Ala Th f 
207 AGGCTTGAGG TCTGGACATA TATATGGGTG ACAATGACAT CCACTTrGCC mCTCTCCA CA GGT GTC CAC 
ivi «v**<_i4 l»Gly Val HI. 

OS? an Val Gin Leu Gin Glu S.r Gl y Pro Gly Ltu Val Arg Pro Sir Qn Thr Leu Ser 
,J CTG ACC TGC ACC GTG TCT GQC TTC ACC ATC AGC ACT GGT TAT AGC TGG CAC TGG GTG AGA 

24»L^ T^ C« Thr Va. Ser 3y Ph. Thr II. Sar Ser Gly Tyr Ser Trp Hi. Trp Va. Arg 
198 CAG CCA CCT GGA CGA GGT CTT GAG TGG ATT GGA TAG ATA CAG TAC ACT GGT ATC ACT AAC 

44>an Pro Pro Gly Arg Gly L.u Glu Trp II. Gl y Tyr II. Gin Tyr S., Gly ... Th r Ajjn 
458 TAC AAC CCC TCT CTC AAA AGT AGA GTG ACA ATG CTG GTA GAC ACC AGC AAG AAC CAG TTC 

64>Tyr A.n Pro Ser L.u Ly. Ser Arg V.I Thr M.I L.u V.I A.p Thr Bar Ly. A.n Gin Ph. 
5 S ACC CTG AGA CTC AGC AGC GTG ACA GCC GCC GAC ACC GOG GTC TAT TAT TGT OCA AGA GAA 

84>s77 Leu Arg L.u S.r Ser Val Thr Ala Ala A.p Thr Ala V.I Tyr Tyr Cys Ala Arg Glu 
578 GAC TAT GAT TAC CAC TOG TAC TTC GAT GTC TGG GGT CAA GCC AGC CTC GTC ACA GTC ACA 
104>ASP Tyr Asp Tyr HI. T,p Tyr Ph. A.p V.I Trp Gly Gin Gly Str L.u V.I Thr V.I Thr 
638 GTC TCC TCA GGTGAGTCCT TACAACCTCT CTCTTCTATT CAGCTTAAAT AGATTTTACT GCATTTGTTG 

737 ^ GGGGGGAAM^ GTGTGTATCT GAATTTCAGG TCATGAAGGA CTAGGGACAC CTTGGGAGTC AGAAAGGGTC 

111 AxTGGGAGCC GTGGCTCATG CAGACAGACA TCCTCAGCTC CCACACCTCA TOSCCAGAGA TT^TAOGCA 

847 TCAGCTTTCT GGGGCAGGCC AGGCCTCACT TTGGCTGGGG GCAGGGAGGG GGCTAAGGTG ACOCAGGTGG 

917 CCCCAGCCAG GCGCACACCC AATGCCCGTG AGCCCAGACA CTCGACCCTG CCTGGACCCT CGTGGATAOA 

W CAAGAACCGA GGGGCCTCTG CGCCCTGGGC CCAGCTCTGT CCCACACCCC AGTCACATGG C^CATCTCr 

lSs7 CTTGCA GCT TCC ACC AAG GGC CCA TOG GTC TTC CCC CTG GCC CCC TOC TCC AGG AGC ACC TCT 
1057 CTTGCA GCT ^ ^ ay ph§ pf0 Leu A|8 pr „ Cy , g,, Arg g., Thr g., 

U20 GGG GGC' ACA GOG GCC CTG GGC TGC CTC GTC AAG GAC TAC TTC CCC GAA COG CTG AGG GTC 
"►rfy ST Thr Ala Al, L.u G.y Cy. L.u V.. Ly. A.p Tyr Ph. Pro Qu Pro VjU Thr £. 

•■30 TCG TGG AAC TCA GGC GCC CTG ACC AGC GGC GTG CAC ACC TTC CCo GCT GTC CTA CAG TCC 
2S»£r Trp A.n Sar Gly Ala L.u Thr Ser Gl y V.I Hi. Thr Ph. Pro Ala V.I L.u Gin Sar 

BstXI (1280) 

,,40 TCA GGA CTC TAC TCC CTC AGC AGC GTG GTG ACC GTG CCC TCC AGC AGC TTG GGC ACC CAG 
60»Sel «y Leu Tyr Ser L.u Ser Sir Val V.I Thr V.I Pro Sar S.r Sir L.u Gl y Thr Ql r, 

130U tcC TAC ACC TGC AAC GTG AA.T CAC AAG CCC AGC AAC ACC AAG GTG GAC AAG AGA GTT 

ll>™ Tyr Thr Cy. A.n Va. Asn HI. Lya Pro Sar A.n Thr Ly, Val Asp LysArs , Val 

13 5? S^-AGAGGC CAGCGCAGGG AGGGAGGGTG TCTGCTGGAA GCCAGGCTCA GCCCTCCTGC CTCGATCCAT 
«7 ScGGCTCTG CAGTCCCAGC CCAGGGCAGC AAGGCAGGCC CCGTCTGACT CCTCACCCGG AGCCTCTGCC 
U97 CGCCCCACrC ATGCTCAGCG AGAGGGTCTT CTGGCTTTTT CCACCAGGCT CCGCGCACGC ACA^CTCGA 

inSSCTCKSCJTCCCCACCGTCCCXX OCTMCCCK CCCAGGACTC GCCCrCCKC TCMCGCOT 



j>Ala Ala Al. Ala Val Gin Gly Gly M.t L.u Tyr 

19 7S CCC CAG GAC AGC CCG TCG COG GAG TGC AAG CAG CTG GAC GGC CTC TGG AGC TTC CGC GCC 

l2 *P,o Gin Glu Sar Pro Sar Arg Glu Cys Lys Glu Leu Asp Gly Lau Trp Ser Ph. Arg Ala 






FIGURE 4B (Continued)

Not) (20B2) 

2035 GAC TTC TCT GAC AAC CGA OQC CGG GGC TTC GAG GAG GAG TGG TAC CGG CGG CCG CTG TGG 

32>Asp Phe S»r Asp Asn Arg Arg Arg Q y Phe Gl u Glu Gin Trp Tyr A rg Arg Pro Leu Trp 

2095 GAG TCA GGC ..CCC ACC GTG GAC ATG CCA GTT CCC TCC AGC TTC AAT GAC ATC AGC CAG CiAC 

52>Glu Set Gly Pro Thr Val Asp Met Pro Val Pro Ser Ser Phe Asn Asp lie Ser Gin Asp 

2155 TOG CGT CTC CGG CAT TTT GTC GGC TGG CTG TGG TAC GAA CGG GAG GTG ATC CTG CCG GAG 

1 72>Trp Arg Leu Arg His Phe Val Gly Trp Val Trp Tyr Gl u Arg Gl u Val Me L«u Pro Gl u 

2215 CGA TGG ACC CAG GAC CTG CGC ACA AGA GTG GTG CTG AGG ATT GGC AGT GCC CAT TCC TAT 

92>Arg Trp Thr Qn Asp Leu Arg Thr Arg Val Val Leu Arg lie Gly Ser Ala His Ser Tyr 

Salt (2296) 

2275 GCC ATC GTG TGG GTG MT GGG GTC GAC ACG CIA GAG CAT GAG GGG GGC TAC CTC CCC TTC 

112Mla Me Val Trp Val Asn Gly Val Asp Thr Ltu Glu His Glu Gly Gly Tyr Leu Pro Phe 

2335 GAG GCC GAC ATC AGC AAC CTG (TTC CAG GTG GGG CCC CTG CCC TCC CGG CTC CGA ATC ACT 

132 ► Glu Ala Asp lie Scr Asn Leu Vai Gin Val Gly Pro Lau Pro Ser Arg Leu Arg its Thr 

2395 ATC C;CC ATC AAC AAC ACA CTC ACC CCC ACC ACC CTG CCA CCA GGG ACC ATC CAA TAC CTG 

152Mle Ala Me Asn Asn Thr Leu Thr Pro Thr Thr Leu Pro Pro Gly Thr 1 1 • Gl n Tyr Leu 

2455 ACT GAC ACC TCC AAG TAT CCC AAG GGT TAC TTT GTC CAG AAC ACA TAT TTT GAC TTT TTC 

172^Thr Asp Thr Ser Lys Tyr Pro Lys Gly Tyr Phc Val Gin Asn Thr Tyr Phe Asp Phe Pha 

2515 AAC TAC OCT GSA CTG CAG CGG TCT GTA CTT CTG TAC ACG ACA CCC ACC ACC TAC ATC GAT 

192Msn Tyr Ala Gly Leu Gin Arg Ser Val Leu Leu Tyr Thr Thr Pro Thr Thr Tyr He Asp 
BstXI (2588) BglU (2627) 

2575 GAC ATC ACC GTC ACC ACC AGC GTG GAG CAA CAC AGT GGG CTG GTG AAT TAC CAG ATC TCT 

212>Asp He Thr Val Thr Thr Ser Val Gl u Gin Asp Ser Gly Leu Val Asn Tyr Gin lie Ser 

2635 GTC AAG GGC AGT AAC CTG TTC AAG TTG GAA GTG CGT CTT TTC GAT CCA GAA AAC AAA ^iT 

232>Val Lys Gly Ser Asn Leu Phe Lys Leu Gl u Vat Arg Leu Leu Asp Ala Gl u Asn Lya Val 

2695 GTG GCG AAT GGG ACT GGG ACC CAG GGC CAA CTT AAG GTG CCA (5GT GTC AGC CTC TGG TGG 

252>Val Ala Asn Gl y Thr Gl y Thr Gl n Gl y Gl n Leu Lys Val Pro Gi y Val Ser Leu Trp Trp 
2755 CCG T.AC CTG ATG CAC GAA CGC CCT GCC TAT CTG TAT TCA TTG GAG GTG CAG CTG ACT GTA 

272>Pro Tyr Leu Met His Gl u Arg Pro Ala Tyr Leu Tyr Ser Leu Gl u Val Gin Leu Thr Ala 

BamHI (2861) 

2815 CAG ACG TCA CTG GGG CCT GTG TCT GAC TTC TAC ACA CTC CCT GTG GOG ATC CGC ACT CTG 

292>Gln Thr Ser Leu Gl y Pro Val Ser Asp Phe Tyr Thr Leu Pro Val Gly lie Arg Thr Val 

2875 GCT GTC ACC AAG AGC CAG TIC CTC ATC AAT GGG AAA CCT TTC TAT TTC CAC GGT GTC AAC 

312>Ala Val Thr Lys Ser Gl n Phe Leu I I e Asn Gly Lys Pro Phe Tyr Phe Hi s Gl y Val Asn 

2935 AAG CAT GAG CWT GCG GAC ATC CGA GGG AAG GGC TTC GAC TGG CCG CTG CTG GTG AAG GAC 

332>Lvs His Glu Asp Ala Asp lie Arg Gly Lys Gly Phe Asp Trp Pro Leu Leu Val Lys Asp 

2995 A: AAC CTG CTT CGC TGG CTT GGT GCC AAC GCT TTC CGT ACC AGC CAC TAC CCC TATGCA 

352>Phe Asn Leu Leu Arg Trp Leu Gly Ala Asn Ala Phe Arg Thr Sjk His Tyr Pro Tyr Ala 

3055 GAG GAA GTG ATG CAG ATG TGT GAC CGC TAT GGG ATT GTG GTC ATC GAT GAG TGT CCC GGC
372> Glu Glu Val Met Gln Met Cys Asp Arg Tyr Gly Ile Val Val Ile Asp Glu Cys Pro Gly

FIGURE 4B (Continued)

3115 GTG GGC ??? GCG CTG CCG CAG TTC TTC AAC AAC GTT TCT CTG CAT CAC CAC ATG CAG GTG
392> Val Gly Leu Ala Leu Pro Gln Phe Phe Asn Asn Val Ser Leu His His His Met Gln Val

3175 ATG GAA GAA GTG GTG CGT AGG GAC AAG AAC CAC CCC GCG GTC GTG ATG TGG TCT GTG GCC
412> Met Glu Glu Val Val Arg Arg Asp Lys Asn His Pro Ala Val Val Met Trp Ser Val Ala

3235 AAC GAG CCT GCG TCC CAC CTA GAA TCT GCT GGC TAC TAC TTG AAG ATG GTG ATC GCT CAC
432> Asn Glu Pro Ala Ser His Leu Glu Ser Ala Gly Tyr Tyr Leu Lys Met Val Ile Ala His

BstXI (3296)

     AAA TCC TTG GAC CCC TCC CGG CCT GTG ACC TTT GTG AGC AAC TCT AAC TAT GCA
     Lys Ser Leu Asp Pro Ser Arg Pro Val Thr Phe Val Ser Asn Ser Asn Tyr Ala

     AAG GGG GCT CCC TAT GTG GAT GTG ATC TGT TTG AAC AGC TAC TAC TCT TGG TAT
     Lys Gly Ala Pro Tyr Val Asp Val Ile Cys Leu Asn Ser Tyr Tyr Ser Trp Tyr

     TAC GGG CAC CTG GAG TTG ATT CAG CTG CAG CTG GCC ACC CAG TTT GAG AAC TGG
     Tyr Gly His Leu Glu Leu Ile Gln Leu Gln Leu Ala Thr Gln Phe Glu Asn Trp

     AAG TAT CAG AAG CCC ATT ATT CAG AGC GAG TAT GGA GCA GAA ACG ATT GCA GGG
     Lys Tyr Gln Lys Pro Ile Ile Gln Ser Glu Tyr Gly Ala Glu Thr Ile Ala Gly

BamHI (3540)

     CAG GAT CCA CCT CTG ATG TTC ACT GAA GAG TAC CAG AAA AGT CTG CTA GAG CAG
     Gln Asp Pro Pro Leu Met Phe Thr Glu Glu Tyr Gln Lys Ser Leu Leu Glu Gln


CTG 


GGT 


CTG 


GAT 


GAA 


AAA 


CGC 


AGA 


AAA 


TAT 


GTG 


GTT 


GGA 


GAG 


CTC 


ATT 


TGG 


AAT 



3715 



3775 GCC 



Leu 


Gly 


Leu 


Asp 


GAT 


TTC 


ATG 


ACT 


Asp 


Phe 


Met 


Thr 


CGG 


CAG 


ACV\ 


CAA 


A rg 


Gin 


A rg 


Gin 


AAT 


CiAA 


ACC 


AGG 


Asn 


Glu 


Thr 


Arg 



Val GJ y GJ u Leu Me Trp Asn Phe 

CTG GGG AAT AAA AAG GOG ATC . TTC 

Leu GJy Asn Lys Lys Gly lie Phe 



r Pro His Ser Val Ala Lys Ser Gin Cys Leu Glu Asn Ser Pro 

3935 TTT ACT TGA GCAAGACTGA TACCACCTGC GTGTCCCTTC CTCCCCGAGT CAGGGCGACT TCCACAGCAG 

632> Phe Thr • • • 

3904 CAGAACAAGT GCCTCCTGGA CTGTTCACGG CAGACCAGAA CGTTTCTGGC CTGGGTTTTG TGGTCATCTA 
3974 TTCTAGCAJGG CiAACACTAAA GGTGGAAATA AAAGATTTTC TATTATGGAA ATAAAGAGTT GGCATGAAAG 

Xbal (4063) 
Sail (4057) SamHI (4060) 
4044 TCGCTACTGN NNNGTCGACT CTAGAGGATC CCCGCTTAAT TAAGTTGTTT ATTGCAGCTT ATAATGGTTA 
4114 CAAATAAAGC AaTAGCATCA CAAATTTCAC AAATAAAGCA TTTTTTTCAC TGCATTCTAG TTGTGGTTTG 

BamHI 

4184 TCCAAACTCA TCAATGTATC TTATCATGTC TGGATCCGAA TTGATCCCCT GGAGACTTGG AAATCCCCCT 

Ncol 

4254 GAGTCAAACC GCTATCCACG CCCATTGATG TACTGCCAAA ACCGCATCAC CATGGTAATA GCGATGACTA 
4324 ATACGTAGAT GTACTGCCAA GTAGGAAAGT CCCATAAGGT CATGTACTGG GCATAATCCC AGGCCGGCCA 
4394 TTTACCCTCA TTGACGTCAA TAGGGGGCGT ACTTGGCATA TGATACACTT GATGTACTGC CAAGTGGGCA 
4464 GTTTACCGTA AATACTCCAC CCATTGACGT CAATGGAAAG TCCCTATTGG CGTTACTATG GGAACATACG 
4534 TCATTATTGA CGTCAATGGG CGGGGGTCGT TGGGCGGTCA GCCAGGCGGG CCATTTACCG TAAGTTATGT 
4604 AACGCGGAAC TCCATATATG GGCTATGAAC TAATGACCCC GTAATTGATT ACTATTAATA ACTAGTCAAT 
4674 AATCAATGTC CGAGCTCGAA ATTCTTGAAG ACGAAAGGGC CTCGTGATAC GCCTATTTTT ATAGGTTAAT 
4744 GTCATGATAA TAATGGTTTC TTAGACGTCA GGTGGCACTT TTCGGGGAAA TGTGCGCGGA ACCCCTATTT 
4814 GTTTATTTTT CTAAATACAT TCAAATATGT ATCCGCTCAT GAGACAATAA CCCTGATAAA TGCTTCAATA 






FIGURE 4B (Continued) 



4884 ATATTGWA AGGAAGAGTA TGAGTATT CAA CAT TTC CGT GTC QCC CTT ATT CCC TTT TTT GCG GC 

l>Qn Hl» Phe Arg Val At a Leu lie Pro Phe Phe At a ai 



4951 TTT TGC CTT CCT GTT TTT GCT CAC CCA GAA ACG CTG GTG AAA GTA AAA GAT GCT GAA GAT 

14>Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys Asp Ala Glu Asp 


5011 CAG TTG GGT GCA CGA GTG GGT TAC ATC GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT GAG 

34>Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu 


5071 AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG ATG AGC ACT TTT AAA GTT CTG ... TGT GGC 

54>Ser Phe Arg Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly 


5131 GCG GTA TTA TCC CGT GTT GAC GCC GGG ... GAG CAA CTC GGT CGC CGC ATA ... TAT TCT 

74>Ala Val Leu Ser Arg Val Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser 












ScaI (5207) 

5191 CAG AAT GAC TTG GTT GAG TAC TCA ... GTC ACA GAA AAG CAT CTT ACG GAT GGC ATG ACA 

94>Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr Asp Gly Met Thr 


5251 GTA AGA GAA TTA ... AGT GCT GCC ATA ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT 

114>Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu 


5311 CTG ACA ACG ATC GGA GGA CCG AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT CAT 

134>Leu Thr Thr Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His 


5371 GTA ACT CGC CTT GAT CGT TGG GAA CCG GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG CGT 

154>Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg 


5431 GAC ACC ACG ATG CCT GCA GCA ATG GCA ACA ACG TTG CGC AAA CTA TTA ACT GGC GAA CTA 

174>Asp Thr Thr Met Pro Ala Ala Met Ala Thr Thr Leu Arg Lys Leu Leu Thr Gly Glu Leu 


5491 CTT ACT CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG ATG GAG GCC GAT AAA GTT GCA GGA 

194>Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly 


5551 CCA CTT CTG CGC TCG GCC CTT CCG GCT GGC TGG TTT ATT GCT GAT AAA TCT GGA GCC GGT 

214>Pro Leu Leu Arg Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly 


5611 GAG CGT GGG TCT CGC GGT ATC ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT ATC 

234>Glu Arg Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile 


5671 GTA GTT ATC TAC ACG ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC GCT 

254>Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn Arg Gln Ile Ala 


5731 GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG TAA CTGTCAGACC AAGTTTACTC ATATATACTT 

274>Glu Ile Gly Ala Ser Leu Ile Lys His Trp • • • 





















5794 TAGATTGATT TAAAACTTCA TTTTTAATTT AAAAGGATCT AGGTGAAGAT CCTTTTTGAT AATCTCATGA 

5864 CCAAAATCCC TTAACGTGAG TTTTCGTTCC ACTGAGCGTC AGACCCCCTA GAAAAGATCA AAGGATCTTC 

5934 TTGAGATCCT TTTTTTCTGC GCGTAATCTG CTGCTTCCAA ACAAAAAAAC CACCGCTACC AGCGGTGGTT 

6004 TGTTTGCCGG ATCAAGAGCT ACCAACTCTT TTTCCGAAGG TAACTGGCTT CAGCAGAGCG CAGATACCAA 

6074 ATACTGTCCT TCTAGTGTAG CCGTAGTTAG GCCACCACTT CAAGAACTCT GTAGCACCGC CTACATACCT 

6144 CGCTCTGCTA ATCCTGTTAC CAGTCGCTGC TGCCAGTGGC GATAAGTCGT GTCTTACCGG GTTGGACTCA 

6214 AGACGATAGT TACCGGATAA GGCGCAGCGG TCGGGCTGAA CGGGGGGTTC GTGCACACAG CCCACCTTGG 

6284 AGCGAACGAC CTACACCGAA CTGAGATACC TACAGCGTGA GCATTGAGAA AGCGCCACGC TTCCCGAAGG 



6354 GAGAAAGGCG GACAGGTATC CGGTAAGCGG CAGGGTCGGA ACAGGAGAGC GCACGACGGA GCTTCCAGGG 



6424 GCAAACGCCT GCTATCTTTA TAGTCCTGTC GGGTTTCGCC ACCTCTGACT TGAGCGTCGA TTTTTGTGAT 


ori 

6494 GCTCGTCAGG CGGGCGGAGC CTATGGAAAA ACGCCAGCAA CGCGGCCTTT TTACGGTTCC TGGCCTTTTG 



6564 CTGGCCTTTT GCTCACATGT TCTTTCCTGC GTTATCCCCT GATTCTCT GC ATAACCGTAT TACCGCCTTT 



6634 GAGTGAGCTG ATACCGCTCG CCGCAGCCGA ACGACCGAGC GCAGCGAGTC AGTGACCGAG GAAGCCGAAG 

6704 AGCGCCTGAT GCGGTATTTT CTCCTTACGC ATCTGTGCGG TATTTCACAC CGCATATGGT GCACTCTCAG 

6774 TACAATCTGC TCTGATGCCG CATAGTTAAG CCAGTATACA CTCCGCTATC GCTACGTGAC TGGGTCATCG 

6844 CTGCGCCCCG ACACCCGCCA ACACCCGCTG ACGCGCCCTG ACGGGCTTGT CTGCTCCCGG CATCCGCTTA 

6914 CAGACAAGCT GTGACCGTCT CCGGGAGCTC CATGTGTCAG AGGTTTTCAC CGTCATCACC GAAACGCGCG 

6984 AGGCAGCTGT GGAATGTGTG TCAGTTAGGG TGTGGAAAGT CCCCACCCTC CCCAGCAGGC AGAAGTATGC 

7054 AAAGCATGCA TCTCAATTAG TCAGCAACCA GGTCTGGAAA GTCCCCAGGC TCCCCAGCAG GCAGAAGTAT 

7124 GCAAAGCATG CATCTCAATT AGTCAGCAAC CATAGTCCCG CCCCTAACTC CGCCCATCCC GCCCCTAACT 

NcoI 

7194 CCGCCCAGTT CCGCCCATTC TCCGCCCCAT GGCTGACTAA TTTTTTTTAT TTATGCAGAG GCCGAGGCCG 
FIGURE 4B (Continued) 

HindIII (7328) 

7264 CCTCGGCCTC TGAGCTATTC CAGAAGTAGT GAGGAGGCTT TTTTGGAGGC CTAGGCTTTT GCAAA 
1 CTCGAGCCAC C ATG GGA TGG AGC TGT ATC ATC CTC TTC TTG GTA GCA ACA GCT ACA 
1>Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr 

57 CGTAAGGGGC TCACAGTAGC AGGCTTGAGG TCTGGACATA TATATGGGTG ACAATGACAT CCACTTTGCC 

127 TTTCTCTCCA CA GGT GTC CAC TCC CAG GTC CAA CTG CAG GAG AGC GGT CCA GGT CTT GTG 
1>Gly Val His Ser Gln Val Gln Leu Gln Glu Ser Gly Pro Gly Leu Val 
187 AGA CCT AGC CAG ACC CTG AGC CTG ACC TGC ACC GTG TCT GGC TTC ACC ATC AGC AGT 

17>Arg Pro Ser Gln Thr Leu Ser Leu Thr Cys Thr Val Ser Gly Phe Thr Ile Ser Ser 
244 GGT TAT AGC TGG CAC TGG GTG AGA CAG CCA CCT GGA CGA GGT CTT GAG TGG ATT GGA 

36>Gly Tyr Ser Trp His Trp Val Arg Gln Pro Pro Gly Arg Gly Leu Glu Trp Ile Gly 
301 TAC ATA CAG TAC AGT GGT ATC ACT AAC TAC AAC CCC TCT CTC AAA AGT AGA GTG ACA 

55>Tyr Ile Gln Tyr Ser Gly Ile Thr Asn Tyr Asn Pro Ser Leu Lys Ser Arg Val Thr 
358 ATG CTG GTA GAC ACC AGC AAG AAC CAG TTC AGC CTG AGA CTC AGC AGC GTG ACA GCC 

74>Met Leu Val Asp Thr Ser Lys Asn Gln Phe Ser Leu Arg Leu Ser Ser Val Thr Ala 
415 GCC GAC ACC GCG GTC TAT TAT TGT GCA AGA GAA GAC TAT GAT TAC CAC TGG TAC TTC 

93>Ala Asp Thr Ala Val Tyr Tyr Cys Ala Arg Glu Asp Tyr Asp Tyr His Trp Tyr Phe 
472 GAT GTC TGG GGT CAA GGC AGC CTC GTC ACA GTC ACA GTC TCC TCA GGTGAGTCCT 

112»Asp Val Trp Gly Gin Gly Ser Leu Val Thr Val Thr Val Se r Sar , mttirTp 

527 TACAACCTCT CTCTTCTATT CAGCTTAAAT AGA3TTTACT GCATTTGTTC GGGGGGAAAT GTGTGTATCT 
597 GAATTTCAGG TCATGAAGGA CTAGGGACAC CTTGGGAOTC AGAAAGGGTC ATTGGGAI3CC GTGGCT0AT3 
667 CAGACAGACA TCCTCAOCTC CCAGACCTCA TGGCCAGAGA TTTATAGGGA TCAGCITrCT GGGOCAGOCC 
737 AGGCCTGACT TTGGCTGGGG GCAGGGAGGG GGCTAAGGTG ACGCAGGTGG CGCCAGCCAG GCGC ACACCC 
807 AATGCCCGTG AGCCCAGACA CTGGACCCTG OTGGACCCT CGTGGATAGA CAACAACCGA OLdMAMM 
877 CGCCCTGGGC CCAGCTCTCT CCCACACCGC AGTCACATQG CGCCMCTCT COTGCA 3CT TCC ACC AAG 

945 GGCCCATCGGTCTTCCCCCTCGCGCXETOCTOCAM 

5>G4v Pro Sar Val Phe Pro Leu Ala Pro Cye Ser Arg Ser Thr Ser Gly Gly Thr Ala 
1002 GCC CTG GGC TGC CTG GTC AAG GAC TAC TIC CCC GAA CCG GTG AGO GTG TCG TOO AAC 

24»Ala Leu Gly Cya Leu Val Lys A«p Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Aan 
1059 TCA OQC GCC CTG ACC AGC GGC CTG CAC ACC TTC CCG GCT GTC CIA CAG TCC TCA GGA 

43»Ser Gly Ala Leu Thr Ser Gly Val Hi a Thr Phe Pro Ala Val Leu Gin Ser Sar Gly 

BstEII (1138) BaiW (1150) 

1116 CTC TAC TCC CTC AGC AGC GIG GTG ACC GTC CCC TCC AGC ACC TTG GCC ACC CAG ADC 

62»Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu Gly Thr Gin Thr 
1173 TAC ACC T3C AAC GTG AAT CAC AAG CCC AGC AAC ACC AAG GTG GAC AAG AGA GMT 

81>Tvr Thr Cys Asn Val Asn Hia Lya Pro Ser Asn Thr Lya Val Aap Lys Arg Val 
1227 GGIGAGAGGC CAGCGCAGGG AGGGAGGGTG TCTGCTGGAA GCCAGGCTCA GCCCTCCTOC CTGGACGCAT 
1297 CCCGGCTGTG CAGTCCCAGC CCAGGGCAGC AAGGCAGGCC CCGTCTGACT CCTCACCrGG AGCCTCTGCC 
1367 CGCCCCACTC ATGCTCAOQG AGAQQGTCTT CTGGCTTTTT CCACCAGGCT CCGGOCAGGC ACAGGCTGGA 
1437 TGCCCCTACC CCAGGCCCIT CACACACAGC GGCAGGTGCT GCGCTCAGAG CTGCCAJAAG CCATATCCAG 

o 

1507 QAGGACCCTS CCCCTGACCT AAGCCCACCC CAAAGGCCAA ACTCTCTACT CACTCAGCTC ACGCATCCAC 
1577 CTCCATCCCA GATCCCCGTA ACTCCCAATC TTCTCTCTGC A GCG GCG GCG GCG GTG CAG GGC GGG 

l>Ala Ala Ala Ala Val Gin Gly Gly 

1642 ATG CTG TAC CCC CAG GAG AGC CCG TCG CGG GAG TGC AAG GAG CTG GAC GGC CTC TGG 

9»Met Leu Tyr Pro Gin Glu Sar Pro Ser Arg Glu Cys Lys Glu Leu Asp Gly Leu Trp 

1699 AGC TTC CGC GCC GAC TTC TCT GAC AAC CGA CQC CGG GGC TTC GAG GftG CAG TGG TAC 

28»Ser Phe Arg Ala Asp Phe Ser Asp Asn Arg Arg Arg Gly Pha Glu Glu Gin Trp Tyr 

1756 CGo'cGS CTG TGG GAG TCA GGC CCC ACC GTG GAC ATC CCA GTT CCC TCC AGC TTC 

47»Arg Arg Pro Leu Trp Glu Ser Gly Pro Thr Val Asp Mai Pro Val Fro Ser Sar Phe 

1813 AAT GAC ATC AGC CAG GAC TGG CGT CTG CGG CAT TTT GTC GGC TGG GTO TGG TAC GAA 

66>Asn Asp lie Sar Gin Asp Trp Arg Lau Arg His Pha Val Gly Trp Val Trp Tyr Glu 
1870 CGG GAG GTG ATC CTG CCG GAG CGA TGG ACC CAG GAC CTG CGC ACA AGA GTG GTG CTG 

85>Arg Glu Val Ile Leu Pro Glu Arg Trp Thr Gln Asp Leu Arg Thr Arg Val Val Leu 

1927 AGG ATT GGC AGT GCC CAT TCC TAT GCC ATC GTG TGG GTC AAT GGG GTGGAT ACG CTA G? 

2<Hlslle 

104>Arg Me Gt y Ser Ala His Ser Tyr Ala lie Val Trp Val Asn Gly ValAsp Thr Leu Gl 
1987 CAT GAG CGG GGC TAC CTC CCC TIC GAG GCC GAC ATC AGC AAC CTQ GTC CAG GTO GGG 

(L24>His GIU Gly Gly Tyr Leu Pro Phe Glu Ala Asp He Ser Asn Leu Val Gin Val Gly 
2044 CCC CTC CCC TCC CGG CTC CGA ATC ACT ATC GCC ATC AAC AAC ACA CTC ACC CCC ACC 

143>Pro Leu Pro Ser Arg Leu Arg lie Thr Me Ala Me Asn Asn Thr Leu Thr Pro Thr 

2101 ACC CTG CCA CCA GGG ACC ATC CAA TAT CTG ACT GAC ACC TCC AAG TAT CCC AAG GGT 

162> Thr Leu Pro Pro Gly Thr Me Gin Tyr Leu Thr Asp Thr Ser Lys Ty Pro Lys Gly 

2158 TAC ITT GTC CAG AAC ACA TOT TTT GAC TTT TIC AAC TAC GCT GGA CTC CAG CGG TCT 

181* Tyr Phe Val Gin Asn Thr Tyr Phe Asp Phe Phe Asn Tyr Ala Gly Leu Gin Arg Ser 

BstXI (2264) 

2215 GTA CTT CTG TAC ACG ACA CCC ACC ACC TAC ATC GAT GAC ATC ACC GTC ACC ADC AGC 

200>Val Leu Leu Tyr Thr Thr Pro Thr Thr Tyr Me Asp Asp M e Thr Val Thr Thr Ser 

BglII (2303) 

2272 GTG GAG CAA GAC AGT GGG CIO GTO AAT TAC CAG ATC TOT GTC AAG GGZ AGT AAC CTG 

219>Val Glu Gin Asp Ser Gly Leu Val Asn Tyr Gin Me Ser Val Lys Gly Ser Asn Leu 

2329 TIC AAG TOG GAA GTG CGT CTT TOG GAT GCA GAA AAC AAA GTC GTG GC3 AAT GGG ACT 

238> Phe Lys Leu Glu Val Arg Leu Leu Asp Ala Glu Asn Lys Val Val Ala Asn Gly Thr 

2386 GGG ACC CAG GGC CAA CTT AAG GTG CCA GGT GTC AGC CTC TGG TGG CCG TAC CTG ATC 

257>Gly Thr G!n Gly Gin Leu Lys Val Pro Gly Val Ser Leu Trp Trp Pro Tyr Leu Met 
2443 CAC GAA CGC CCT GCC TAT CTG TAT TCA TOG GAG GTG CAG CTG ACT GCA CAG ACG TCA 

276> His Glu Arg Pro Ala Tyr Leu Tyr Ser Leu Glu Val Gin Leu Thr Ala Gin Thr Ser 

BamHI (2537) 

2500 CTG GGG CCT GTG TCT GAC TTC TAC ACA CTC CCT GTG GGG ATC CGC ACT GIG OCT GTC 

295>Leu Gly Pro Val Ser Asp Phe Tyr Thr Leu Pro Val Gly Me Arg Thr Val Ala Val 

2557 ACC AAG AGC CAG TTC CTC ATC AAT GGG AAA CCT TTC TAT TTC CAC GCT GTC AAC AAG 

314> Thr Lys Ser Gl n Phe Leu Me Asn Gly Lys Pro Phe Tyr Phe His Gly Val Asn Lys 

2614 CAT GAG GAT GCG GAC ATC CGA GGG AAG GGC TTC GAC TOG CCG CTG COG GTG AAG GAC 

333>His Glu Asp Ala Asp Me Arg Gly Lys Gly Phe Asp Trp Pro Leu L»u Val Lys Asp 

2671 TTC AAC CTG CTT CGC TGG CTT GGT GCC AAC GCT TTC CGT ACC AGC CI*C TAC CCC TAT 

352>Phe Asn Leu Leu Arg Trp Leu Gly Ala Asn Ala Phe Arg Thr Ser His Tyr Pro Tyr 

2728 GCA GAG GAA GTG ATG CAG ATC TOT GAC CGC TAT GGG ATT GTG CTC ATC GAT GAG TGT 

371*Ala Glu Glu Val Met Gin Met Cys Asp Arg Tyr Gly Me Val Val Me Asp Glu Cys 

2785 CCC GCC GTC GGC CTG GOG CTC CCG CAG TTC TTC AAC AAC GTT TCT CIG CAT CAC CAC 

390>Pro Gly Val Gly Leu Ala Leu Pro Gin Phe Phe Asn Asn Val Ser Leu His HI a His 
2842 ATG CAG GTG ATG GAA GAA GTC GTG CGT AGG GAC AAG AAC CAC CCC GCG GTC GTG ATG 

409>Met Gln Val Met Glu Glu Val Val Arg Arg Asp Lys Asn His Pro Ala Val Val Met 

2899 TGG TCT GTC GCC AAC GAG CCT GCG TCC CAC CTA GAA TCT GCT GGC TAC TAC TTC AAG 



428>Trp Ser Val Ala Asn Glu Pro Ala Ser His Leu Glu Ser Ala Gly lyr Tyr Leu Lys 
Figure 5 Continued 

BstXI (2972) 

2956 ATG GTG ATC GCT CAC AOC AAA TCC TTG GAC CCC ItXT CGG CCT GTG ACC TTT GTG AGC 

447>Mst Val He Ala His Thr Lys Ser Leu Asp Pro Ser Arg Pro Val Thr Phe Val Ser 

3013 AAC TCT AAC TAT GCA GCA GAC AAG QGG GCT CCG TAT GTG GAT GTG ATC TGT TOG AAC 

466>Asn Ser Asn Tyr Ala Ala Asp Lys Gly Ala Pro Tyr Val Asp Val He Cy» Leu Aan 

3070 AGC TAC TAC TCT TGG TAT CAC GAC TAC GGG CAC CTG GAG TTG ATT CAC CTG CAG CIG 

485>Ser Tyr Tyr Ser Trp Tyr His Asp Tyr Gly His Leu Glu Leu lie Glr Leu Gin Leu 

3127 GCC ACC CAG TIT GAG AAC TOG TAT AAG AAG TAT CAG AAG CCC ATT AIT CAG AGC GAG 

504>Ala Thr Gin Pha Gl u Ash Trp Tyr Lys Lys Tyr Gl n Lys Pro 11 e lie- Gin Ser Gl u 

BamHI (3216) 

3184 TAT GGA GCA GAA ACG ATT GCA GGG TTT CAC CAG GAT CCA CCT CTG ATG TTC ACT GAA 

523>Tyr Gly Ala Glu Thr Ile Ala Gly Phe His Gln Asp Pro Pro Leu Met Phe Thr Glu 
3241 GAG TAC CAG AAA AGT CTG CTA GAG CAG TAC CAT CTG GGT CTG GAT CAA AAA CGC AGA 

542>Glu Tyr Gin Lys Ser Leu Leu Glu Gin Tyr Hie Leu Gly Leu Asp Gin Lys Arg Arc, 

3298 AAA TAT GTG GIT GGA GAG CIC ATT TGG AAT TTT GCC GAT TTC ATG ACT GAA CAG TCA 

561>Lys Tyr Val Val Gly Glu Leu He Trp Aan Phe Ala Asp Phe Met TV Glu Gl n Ser 

3355 CCG ADG AGA GTG CTG GQG AAT AAA AAG GGG ATC TTC ACT CGG CAG AG** CAA CCA AAA 

580>Pfo Thr Arg Val Leu Gly Asn Lys Lys Gly lie Phe Thr Arg Gin Arg Gin Pro Lya 

3412 AGT OCA GCG TTC CTT TIG CGA GAG AGA TAC TGG AAG ATT GCC AAT GAA ACC AGG TAT 

599>Ser Ala Ala Phe Leu Leu Arg Glu Arg Tyr Trp Lys He Ala Asn Qu Thr Arg Tyr 
3469 CCC CAC TCA GTA GCC AAG TCA CAA TGT TTG GAA AAC AGC CCG TTT ACT TGA G GTCGAG 
618>Pro His Ser Val Ala Lys Ser Gln Cys Leu Glu Asn Ser Pro Phe Thr 
3' beta casein 



Figure 6 
MUTATIONS TO B-GLUCURONIDASE 



406 




B-glucuronidase 



TAACTCCCAA TCTTCTCTCT GCA GCG GCG GCG GCG GTG CAG GGC GGG ATG CTG TAC 

Ala Ala Ala Ala Val 



intron 



B -Glucuronidase 



Linker 



The gapped heteroduplex method was used to remove: 

(a) the hinge region 

(b) the hinge and ala-ala-ala-ala-val linker, fusing the CH1 
region to B-glucuronidase. 



oligo used: ACTCAGCTCA CGCATCCACC 
for #403 intron intron 



oligo used: GACAAGAGAGTT CAGGGCGGGATG 
for #406 CH2 B-glucuronidase 



Figure 8 
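
The #406 deletion described above can be sketched in a few lines of Python. This is only an illustration of how a deletion oligo whose two halves flank the unwanted region defines the fused product; it is not the laboratory procedure itself. The function name `apply_deletion_oligo` and the constant names are hypothetical. `CH1_TAIL`, `LINKER` and `GUS_START` are short fragments read from the figures (with the last CH1 codon taken from the #406 oligo), and `SPACER` is a shortened stand-in for the hinge/intron region, not the full sequence.

```python
def apply_deletion_oligo(template, left_arm, right_arm):
    """Join the template at the two oligo arms, dropping everything between them."""
    left_end = template.index(left_arm) + len(left_arm)  # end of the 5' arm match
    right_start = template.index(right_arm, left_end)    # start of the 3' arm match
    deleted = template[left_end:right_start]
    product = template[:left_end] + template[right_start:]
    return product, deleted


# Fragments read from the figures; SPACER is an assumed, shortened placeholder.
CH1_TAIL = "AAGGTGGACAAGAGAGTT"   # end of CH1: Lys Val Asp Lys Arg Val
SPACER = "GGTGAGAGGCCAGC"         # stand-in for the hinge/intron region
LINKER = "GCGGCGGCGGCGGTG"        # Ala Ala Ala Ala Val linker
GUS_START = "CAGGGCGGGATGCTGTAC"  # start of B-glucuronidase: Gln Gly Gly Met Leu Tyr

template = CH1_TAIL + SPACER + LINKER + GUS_START

# The two halves of the #406 oligo: GACAAGAGAGTT (CH1 side), CAGGGCGGGATG (enzyme side).
product, deleted = apply_deletion_oligo(template, "GACAAGAGAGTT", "CAGGGCGGGATG")

print(product)  # CH1 tail joined directly to the B-glucuronidase start
print(deleted)  # the removed spacer plus linker
```

In this toy template the removed segment is the placeholder spacer plus the linker, and the product contains the full 24-base oligo sequence across the new junction.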